1. DiNino M. Age and masking effects on acoustic cues for vowel categorization. JASA Express Letters 2024;4:060001. PMID: 38884558; DOI: 10.1121/10.0026371.
Abstract
Age-related changes in auditory processing may reduce physiological coding of acoustic cues, contributing to older adults' difficulty perceiving speech in background noise. This study investigated whether older adults differed from young adults in patterns of acoustic cue weighting for categorizing vowels in quiet and in noise. All participants relied primarily on spectral quality to categorize /ɛ/ and /æ/ sounds under both listening conditions. However, relative to young adults, older adults exhibited greater reliance on duration and less reliance on spectral quality. These results suggest that aging alters patterns of perceptual cue weights that may influence speech recognition abilities.
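Cue-weighting patterns of the kind reported here are commonly estimated by fitting a logistic model to trial-level categorization responses and comparing the coefficient for each cue. The sketch below is a minimal illustration of that idea, not the study's analysis; the data are simulated and the variable names are assumptions.

```python
# Minimal sketch of estimating perceptual cue weights with logistic regression.
# Not the study's analysis pipeline: the data are simulated and all names are
# illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_trials = 400

# Two orthogonally varied cues in standardized units: spectral quality
# (step along an /ɛ/-/æ/ continuum) and vowel duration.
spectral = rng.uniform(-1, 1, n_trials)
duration = rng.uniform(-1, 1, n_trials)

# Simulated listener who weights spectral quality more heavily than duration.
logit = 2.5 * spectral + 0.8 * duration
p_ae = 1 / (1 + np.exp(-logit))
resp_ae = (rng.random(n_trials) < p_ae).astype(int)   # 1 = categorized as /æ/

X = np.column_stack([spectral, duration])
model = LogisticRegression().fit(X, resp_ae)

w_spectral, w_duration = model.coef_[0]
total = abs(w_spectral) + abs(w_duration)
print(f"spectral weight: {w_spectral:.2f} ({abs(w_spectral) / total:.0%} of total)")
print(f"duration weight: {w_duration:.2f} ({abs(w_duration) / total:.0%} of total)")
```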
2. Xie Z, Gaskins CR, Tinnemore AR, Shader MJ, Gordon-Salant S, Anderson S, Goupell MJ. Spectral degradation and carrier sentences increase age-related temporal processing deficits in a cue-specific manner. J Acoust Soc Am 2024;155:3983-3994. PMID: 38934563; PMCID: PMC11213620; DOI: 10.1121/10.0026434.
Abstract
Advancing age is associated with decreased sensitivity to temporal cues in word segments, particularly when target words follow non-informative carrier sentences or are spectrally degraded (e.g., vocoded to simulate cochlear-implant stimulation). This study investigated whether age, carrier sentences, and spectral degradation interacted to cause undue difficulty in processing speech temporal cues. Younger and older adults with normal hearing performed phonemic categorization tasks on two continua: a Buy/Pie contrast with voice onset time changes for the word-initial stop and a Dish/Ditch contrast with silent interval changes preceding the word-final fricative. Target words were presented in isolation or after non-informative carrier sentences, and were unprocessed or degraded via sinewave vocoding (2, 4, and 8 channels). Older listeners exhibited reduced sensitivity to both temporal cues compared to younger listeners. For the Buy/Pie contrast, age, carrier sentence, and spectral degradation interacted such that the largest age effects were seen for unprocessed words in the carrier sentence condition. This pattern differed from the Dish/Ditch contrast, where reducing spectral resolution exaggerated age effects, but introducing carrier sentences largely left the patterns unchanged. These results suggest that certain temporal cues are particularly susceptible to aging when placed in sentences, likely contributing to the difficulties of older cochlear-implant users in everyday environments.
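Sensitivity to a temporal cue along a continuum such as voice onset time is typically summarized by fitting a psychometric function and reading off its boundary and slope. The following is a hedged sketch of such a fit; the VOT steps and response proportions are invented for illustration, not data from this study.

```python
# Hedged sketch: logistic psychometric function fit to categorization responses
# along a VOT continuum (e.g., "Buy" vs. "Pie"). Data below are invented.
import numpy as np
from scipy.optimize import curve_fit

def psychometric(x, boundary, slope):
    """P("Pie" response) as a logistic function of voice onset time (ms)."""
    return 1.0 / (1.0 + np.exp(-slope * (x - boundary)))

vot_ms = np.array([0, 10, 20, 30, 40, 50, 60], dtype=float)
p_pie = np.array([0.02, 0.05, 0.20, 0.55, 0.85, 0.95, 0.98])

(boundary, slope), _ = curve_fit(psychometric, vot_ms, p_pie, p0=[30.0, 0.2])
print(f"category boundary: {boundary:.1f} ms VOT; slope: {slope:.3f} per ms")
# A shallower fitted slope indicates reduced sensitivity to the temporal cue,
# the pattern reported above for older listeners.
```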
Affiliation(s)
- Zilong Xie
- School of Communication Science and Disorders, Florida State University, Tallahassee, Florida 32306, USA
- Casey R Gaskins
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Anna R Tinnemore
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, Maryland 20742, USA
- Maureen J Shader
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907, USA
- Sandra Gordon-Salant
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, Maryland 20742, USA
- Samira Anderson
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, Maryland 20742, USA
- Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, Maryland 20742, USA
3. de Jong TJ, Hakkesteegt MM, van der Schroeff MP, Vroegop JL. Communicating Emotion: Vocal Expression of Linguistic and Emotional Prosody in Children With Mild to Profound Hearing Loss Compared With That of Normal Hearing Peers. Ear Hear 2024;45:72-80. PMID: 37316994; PMCID: PMC10718210; DOI: 10.1097/aud.0000000000001399.
Abstract
OBJECTIVES Emotional prosody is known to play an important role in social communication. Research has shown that children with cochlear implants (CCIs) may face challenges in their ability to express prosody, as their expressions may have less distinct acoustic contrasts and therefore may be judged less accurately. The prosody of children with milder degrees of hearing loss, who wear hearing aids, has been sparsely investigated. A better understanding of prosodic expression by children with hearing loss, hearing aid users in particular, could raise awareness among healthcare professionals and parents of limitations in social communication, which may lead to more targeted rehabilitation. This study aimed to compare the prosodic expression potential of children wearing hearing aids (CHA) with that of CCIs and children with normal hearing (CNH). DESIGN In this prospective experimental study, utterances containing emotional expressions (happy, sad, and angry) were recorded from pediatric hearing aid users, cochlear implant users, and CNH during a reading task. Three acoustic properties were calculated for each utterance: fundamental frequency (F0), variance in fundamental frequency (SD of F0), and intensity. Acoustic properties of the utterances were compared within subjects and between groups. RESULTS A total of 75 children were included (CHA: 26, CCI: 23, and CNH: 26). Participants were between 7 and 13 years of age. The 15 CCIs with congenital hearing loss had received their cochlear implant at a median age of 8 months. The acoustic patterns of emotions uttered by CHA were similar to those of CCI and CNH. Only in CCI did we find no difference in F0 variation between happiness and anger, although an intensity difference was present. In addition, CCI and CHA produced poorer happy-sad contrasts than did CNH. CONCLUSIONS The findings of this study suggest that, on a fundamental acoustic level, both CHA and CCI have a prosodic expression potential that is almost on par with that of their normal hearing peers. However, because some minor limitations were observed in the prosodic expression of these children, it is important to determine whether these differences are perceptible to listeners and could affect social communication. This study lays the groundwork for further research that will help us fully understand the implications of these findings and how they may affect the communication abilities of these children. With a clearer understanding of these factors, effective ways to help improve these children's communication skills can be developed.
Affiliation(s)
- Tjeerd J. de Jong
- Department of Otorhinolaryngology and Head and Neck Surgery, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Marieke M. Hakkesteegt
- Department of Otorhinolaryngology and Head and Neck Surgery, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Marc P. van der Schroeff
- Department of Otorhinolaryngology and Head and Neck Surgery, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Jantien L. Vroegop
- Department of Otorhinolaryngology and Head and Neck Surgery, University Medical Center Rotterdam, Rotterdam, the Netherlands
4. Cychosz M, Xu K, Fu QJ. Effects of spectral smearing on speech understanding and masking release in simulated bilateral cochlear implants. PLoS One 2023;18:e0287728. PMID: 37917727; PMCID: PMC10621938; DOI: 10.1371/journal.pone.0287728.
Abstract
Differences in spectro-temporal degradation may explain some variability in cochlear implant users' speech outcomes. The present study employs vocoder simulations on listeners with typical hearing to evaluate how differences in degree of channel interaction across ears affects spatial speech recognition. Speech recognition thresholds and spatial release from masking were measured in 16 normal-hearing subjects listening to simulated bilateral cochlear implants. 16-channel sine-vocoded speech simulated limited, broad, or mixed channel interaction, in dichotic and diotic target-masker conditions, across ears. Thresholds were highest with broad channel interaction in both ears but improved when interaction decreased in one ear and again in both ears. Masking release was apparent across conditions. Results from this simulation study on listeners with typical hearing show that channel interaction may impact speech recognition more than masking release, and may have implications for the effects of channel interaction on cochlear implant users' speech recognition outcomes.
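For readers unfamiliar with sine vocoding, the sketch below illustrates the general technique: split speech into bands, extract each band's temporal envelope, and re-impose it on a sine carrier at the band center; shallower analysis-filter slopes are one way to simulate broader channel interaction. All parameter values are illustrative assumptions, not the settings used in this study.

```python
# Generic 16-channel sine-vocoder sketch (not the study's processing chain).
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def sine_vocode(signal, fs, n_channels=16, f_lo=200.0, f_hi=7000.0, order=4):
    # Channel edges spaced logarithmically between f_lo and f_hi.
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    out = np.zeros_like(signal, dtype=float)
    t = np.arange(len(signal)) / fs
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Lower filter order -> shallower slopes -> more simulated channel
        # interaction between adjacent analysis bands.
        sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)
        envelope = np.abs(hilbert(band))                       # temporal envelope
        carrier = np.sin(2 * np.pi * np.sqrt(lo * hi) * t)     # sine at band center
        out += envelope * carrier
    return out / (np.max(np.abs(out)) + 1e-12)

# Example: vocode 1 s of noise sampled at 16 kHz.
fs = 16000
x = np.random.default_rng(1).standard_normal(fs)
y = sine_vocode(x, fs)
```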
Affiliation(s)
- Margaret Cychosz
- Department of Linguistics, University of California, Los Angeles, Los Angeles, CA, United States of America
- Kevin Xu
- Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, United States of America
- Qian-Jie Fu
- Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, United States of America
5. Chiossi JSC, Patou F, Ng EHN, Faulkner KF, Lyxell B. Phonological discrimination and contrast detection in pupillometry. Front Psychol 2023;14:1232262. PMID: 38023001; PMCID: PMC10646334; DOI: 10.3389/fpsyg.2023.1232262.
Abstract
Introduction The perception of phonemes is guided by both low-level acoustic cues and high-level linguistic context. However, differentiating between these two types of processing can be challenging. In this study, we explore the utility of pupillometry as a tool to investigate both low- and high-level processing of phonological stimuli, with a particular focus on its ability to capture novelty detection and cognitive processing during speech perception. Methods Pupillometric traces were recorded from a sample of 22 Danish-speaking adults, with self-reported normal hearing, while performing two phonological-contrast perception tasks: a nonword discrimination task, which included minimal-pair combinations specific to the Danish language, and a nonword detection task involving the detection of phonologically modified words within sentences. The study explored the perception of contrasts in both unprocessed speech and degraded speech input, processed with a vocoder. Results No difference in peak pupil dilation was observed when the contrast occurred between two isolated nonwords in the nonword discrimination task. For unprocessed speech, higher peak pupil dilations were measured when phonologically modified words were detected within a sentence compared to sentences without the nonwords. For vocoded speech, higher peak pupil dilation was observed for sentence stimuli, but not for the isolated nonwords, although performance decreased similarly for both tasks. Conclusion Our findings demonstrate the complexity of pupil dynamics in the presence of acoustic and phonological manipulation. Pupil responses seemed to reflect higher-level cognitive and lexical processing related to phonological perception rather than low-level perception of acoustic cues. However, the incorporation of multiple talkers in the stimuli, coupled with the relatively low task complexity, may have affected the pupil dilation.
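Peak pupil dilation is usually computed relative to a pre-stimulus baseline within a fixed analysis window. The snippet below is a generic sketch of that measure; the sampling rate, window lengths, and synthetic trace are assumptions rather than this study's processing pipeline.

```python
# Generic baseline-corrected peak pupil dilation (illustrative parameters only).
import numpy as np

def peak_pupil_dilation(trace, fs, baseline_s=1.0, window_s=(0.0, 3.0)):
    """Peak dilation relative to the mean of a pre-stimulus baseline.

    trace: pupil diameter samples; the first baseline_s seconds are pre-stimulus.
    """
    n_base = int(baseline_s * fs)
    baseline = np.nanmean(trace[:n_base])
    start = n_base + int(window_s[0] * fs)
    stop = n_base + int(window_s[1] * fs)
    return np.nanmax(trace[start:stop]) - baseline

# Example with a synthetic 4-s trace sampled at 60 Hz (dilation peak ~1.2 s post-onset).
fs = 60
t = np.arange(4 * fs) / fs
trace = 4.0 + 0.3 * np.exp(-((t - 2.2) ** 2) / 0.5)   # pupil diameter in mm
print(f"peak dilation: {peak_pupil_dilation(trace, fs):.2f} mm")
```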
Affiliation(s)
- Julia S. C. Chiossi
- Oticon A/S, Smørum, Denmark
- Department of Special Needs Education, University of Oslo, Oslo, Norway
- Elaine Hoi Ning Ng
- Oticon A/S, Smørum, Denmark
- Department of Behavioural Sciences and Learning, Linnaeus Centre HEAD, Swedish Institute for Disability Research, Linköping University, Linköping, Sweden
- Björn Lyxell
- Department of Special Needs Education, University of Oslo, Oslo, Norway
6. Ou J, Xiang M, Yu ACL. Individual variability in subcortical neural encoding shapes phonetic cue weighting. Sci Rep 2023;13:9991. PMID: 37340072; DOI: 10.1038/s41598-023-37212-y.
Abstract
Recent studies have revealed great individual variability in cue weighting, and such variation has been shown to be systematic across individuals and linked to differences in general cognitive mechanisms. The present study investigated the role of subcortical encoding as a source of individual variability in cue weighting by focusing on English listeners' frequency-following responses to the tense/lax English vowel contrast, which varies in spectral and durational cues. Listeners differed in early auditory encoding, with some encoding the spectral cue more veridically than the durational one, while others exhibited the reverse pattern. These differences in cue encoding further correlate with behavioral variability in cue weighting, suggesting that specificity in cue encoding across individuals modulates how cues are weighted in downstream processes.
Affiliation(s)
- Jinghua Ou
- Department of Linguistics, University of Chicago, 1115 E. 58th Street, Chicago, IL 60637, USA
- Ming Xiang
- Department of Linguistics, University of Chicago, 1115 E. 58th Street, Chicago, IL 60637, USA
- Alan C L Yu
- Department of Linguistics, University of Chicago, 1115 E. 58th Street, Chicago, IL 60637, USA
7. Fleming JT, Winn MB. Strategic perceptual weighting of acoustic cues for word stress in listeners with cochlear implants, acoustic hearing, or simulated bimodal hearing. J Acoust Soc Am 2022;152:1300. PMID: 36182279; PMCID: PMC9439712; DOI: 10.1121/10.0013890.
Abstract
Perception of word stress is an important aspect of recognizing speech, guiding the listener toward candidate words based on the perceived stress pattern. Cochlear implant (CI) signal processing is likely to disrupt some of the available cues for word stress, particularly vowel quality and pitch contour changes. In this study, we used a cue weighting paradigm to investigate differences in stress cue weighting patterns between participants listening with CIs and those with normal hearing (NH). We found that participants with CIs gave less weight to frequency-based pitch and vowel quality cues than NH listeners but compensated by upweighting vowel duration and intensity cues. Nonetheless, CI listeners' stress judgments were also significantly influenced by vowel quality and pitch, and they modulated their usage of these cues depending on the specific word pair in a manner similar to NH participants. In a series of separate online experiments with NH listeners, we simulated aspects of bimodal hearing by combining low-pass filtered speech with a vocoded signal. In these conditions, participants upweighted pitch and vowel quality cues relative to a fully vocoded control condition, suggesting that bimodal listening holds promise for restoring the stress cue weighting patterns exhibited by listeners with NH.
Affiliation(s)
- Justin T Fleming
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Matthew B Winn
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota 55455, USA
8. He S, Skidmore J, Conroy S, Riggs WJ, Carter BL, Xie R. Neural Adaptation of the Electrically Stimulated Auditory Nerve Is Not Affected by Advanced Age in Postlingually Deafened, Middle-aged, and Elderly Adult Cochlear Implant Users. Ear Hear 2022;43:1228-1244. PMID: 34999595; PMCID: PMC9232840; DOI: 10.1097/aud.0000000000001184.
Abstract
OBJECTIVE This study aimed to investigate the associations between advanced age and the amount and the speed of neural adaptation of the electrically stimulated auditory nerve (AN) in postlingually deafened adult cochlear implant (CI) users. DESIGN Study participants included 26 postlingually deafened adult CI users, ranging in age between 28.7 and 84.0 years (mean: 63.8 years, SD: 14.4 years) at the time of testing. All study participants used a Cochlear Nucleus device with a full electrode array insertion in the test ear. The stimulus was a 100-ms pulse train with a pulse rate of 500, 900, 1800, or 2400 pulses per second (pps) per channel. The stimulus was presented at the maximum comfortable level measured at 2400 pps with a presentation rate of 2 Hz. Neural adaptation of the AN was evaluated using electrophysiological measures of the electrically evoked compound action potential (eCAP). The amount of neural adaptation was quantified by the adaptation index (AI) within three time windows: around 0 to 8 ms (window 1), 44 to 50 ms (window 2), and 94 to 100 ms (window 3). The speed of neural adaptation was quantified using a two-parameter power law estimation. In 23 participants, four electrodes across the electrode array were tested. In three participants, three electrodes were tested. Results measured at different electrode locations were averaged for each participant at each pulse rate to get an overall representation of neural adaptation properties of the AN across the cochlea. Linear-mixed models (LMMs) were used (1) to evaluate the effects of age at testing and pulse rate on the speed of neural adaptation and (2) to assess the effects of age at testing, pulse rate, and duration of stimulation (i.e., time window) on the amount of neural adaptation in these participants. RESULTS There was substantial variability in both the amount and the speed of neural adaptation of the AN among study participants. The amount and the speed of neural adaptation increased at higher pulse rates. In addition, larger amounts of adaptation were observed for longer durations of stimulation. There was no significant effect of age on the speed or the amount of neural adaptation. CONCLUSIONS The amount and the speed of neural adaptation of the AN are affected by both the pulse rate and the duration of stimulation, with higher pulse rates and longer durations of stimulation leading to faster and greater neural adaptation. Advanced age does not affect neural adaptation of the AN in postlingually deafened, middle-aged and elderly adult CI users.
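As a rough illustration of the two outcome measures named above, the sketch below computes a window-based adaptation index from normalized response amplitudes and fits a two-parameter power law to the amplitude-versus-time course. The exact definitions used in the study may differ; the eCAP amplitudes here are simulated and the formulation is only one plausible choice.

```python
# Hedged sketch: adaptation index within a time window plus a two-parameter
# power-law fit. Simulated amplitudes; not the study's exact definitions.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(2)
t_ms = np.linspace(2, 100, 50)                                  # time after train onset
amps = 1.0 * t_ms ** -0.25 + rng.normal(0, 0.02, t_ms.size)     # simulated eCAP amplitudes
amps_norm = amps / amps[0]                                       # normalize to first response

def adaptation_index(times, norm_amps, window):
    """Mean normalized amplitude inside a time window (lower = more adaptation)."""
    mask = (times >= window[0]) & (times <= window[1])
    return float(np.mean(norm_amps[mask]))

for win in [(0, 8), (44, 50), (94, 100)]:                        # windows named in the abstract
    print(win, round(adaptation_index(t_ms, amps_norm, win), 3))

def power_law(t, a, b):
    return a * t ** (-b)

(a, b), _ = curve_fit(power_law, t_ms, amps_norm, p0=[1.0, 0.2])
print(f"power-law exponent b = {b:.3f} (larger = faster adaptation)")
```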
Affiliation(s)
- Shuman He
- Department of Otolaryngology – Head and Neck Surgery, The Ohio State University, 915 Olentangy River Road, Columbus, OH 43212
- Department of Audiology, Nationwide Children’s Hospital, 700 Children’s Drive, Columbus, OH 43205
- Jeffrey Skidmore
- Department of Otolaryngology – Head and Neck Surgery, The Ohio State University, 915 Olentangy River Road, Columbus, OH 43212
- Sara Conroy
- Center for Biostatistics, Department of Bioinformatics, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210
- William J. Riggs
- Department of Otolaryngology – Head and Neck Surgery, The Ohio State University, 915 Olentangy River Road, Columbus, OH 43212
- Department of Audiology, Nationwide Children’s Hospital, 700 Children’s Drive, Columbus, OH 43205
- Brittney L. Carter
- Department of Otolaryngology – Head and Neck Surgery, The Ohio State University, 915 Olentangy River Road, Columbus, OH 43212
- Ruili Xie
- Department of Otolaryngology – Head and Neck Surgery, The Ohio State University, 915 Olentangy River Road, Columbus, OH 43212
9. Tamati TN, Sevich VA, Clausing EM, Moberly AC. Lexical Effects on the Perceived Clarity of Noise-Vocoded Speech in Younger and Older Listeners. Front Psychol 2022;13:837644. PMID: 35432072; PMCID: PMC9010567; DOI: 10.3389/fpsyg.2022.837644.
Abstract
When listening to degraded speech, such as speech delivered by a cochlear implant (CI), listeners make use of top-down linguistic knowledge to facilitate speech recognition. Lexical knowledge supports speech recognition and enhances the perceived clarity of speech. Yet, the extent to which lexical knowledge can be used to effectively compensate for degraded input may depend on the degree of degradation and the listener's age. The current study investigated lexical effects in the compensation for speech that was degraded via noise-vocoding in younger and older listeners. In an online experiment, younger and older normal-hearing (NH) listeners rated the clarity of noise-vocoded sentences on a scale from 1 ("very unclear") to 7 ("completely clear"). Lexical information was provided by matching text primes and the lexical content of the target utterance. Half of the sentences were preceded by a matching text prime, while half were preceded by a non-matching prime. Each sentence also consisted of three key words of high or low lexical frequency and neighborhood density. Sentences were processed to simulate CI hearing, using an eight-channel noise vocoder with varying filter slopes. Results showed that lexical information impacted the perceived clarity of noise-vocoded speech. Noise-vocoded speech was perceived as clearer when preceded by a matching prime, and when sentences included key words with high lexical frequency and low neighborhood density. However, the strength of the lexical effects depended on the level of degradation. Matching text primes had a greater impact for speech with poorer spectral resolution, but lexical content had a smaller impact for speech with poorer spectral resolution. Finally, lexical information appeared to benefit both younger and older listeners. Findings demonstrate that lexical knowledge can be employed by younger and older listeners in cognitive compensation during the processing of noise-vocoded speech. However, lexical content may not be as reliable when the signal is highly degraded. Clinical implications are that for adult CI users, lexical knowledge might be used to compensate for the degraded speech signal, regardless of age, but some CI users may be hindered by a relatively poor signal.
Affiliation(s)
- Terrin N. Tamati
- Department of Otolaryngology – Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH, United States
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Victoria A. Sevich
- Department of Speech and Hearing Science, The Ohio State University, Columbus, OH, United States
- Emily M. Clausing
- Department of Otolaryngology – Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH, United States
- Aaron C. Moberly
- Department of Otolaryngology – Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH, United States
10. Shader MJ, Kwon BJ, Gordon-Salant S, Goupell MJ. Open-Set Phoneme Recognition Performance With Varied Temporal Cues in Younger and Older Cochlear Implant Users. J Speech Lang Hear Res 2022;65:1196-1211. PMID: 35133853; PMCID: PMC9150732; DOI: 10.1044/2021_jslhr-21-00299.
Abstract
PURPOSE The goal of this study was to investigate the effect of age on phoneme recognition performance in which the stimuli varied in the amount of temporal information available in the signal. Chronological age is increasingly recognized as a factor that can limit the amount of benefit an individual can receive from a cochlear implant (CI). Central auditory temporal processing deficits in older listeners may contribute to the performance gap between younger and older CI users on recognition of phonemes varying in temporal cues. METHOD Phoneme recognition was measured at three stimulation rates (500, 900, and 1800 pulses per second) and two envelope modulation frequencies (50 Hz and unfiltered) in 20 CI participants ranging in age from 27 to 85 years. Speech stimuli were multiple word pairs differing in temporal contrasts and were presented via direct stimulation of the electrode array using an eight-channel continuous interleaved sampling strategy. Phoneme recognition performance was evaluated at each stimulation rate condition using both envelope modulation frequencies. RESULTS Duration of deafness was the strongest subject-level predictor of phoneme recognition, with participants with longer durations of deafness having poorer performance overall. Chronological age did not predict performance for any stimulus condition. Additionally, duration of deafness interacted with envelope filtering. Participants with shorter durations of deafness were able to take advantage of higher frequency envelope modulations, while participants with longer durations of deafness were not. CONCLUSIONS Age did not significantly predict phoneme recognition performance. In contrast, longer durations of deafness were associated with a reduced ability to utilize available temporal information within the signal to improve phoneme recognition performance.
Affiliation(s)
- Maureen J. Shader
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN
- Matthew J. Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park
11. Xie Z, Anderson S, Goupell MJ. Stimulus context affects the phonemic categorization of temporally based word contrasts in adult cochlear-implant users. J Acoust Soc Am 2022;151:2149. PMID: 35364963; PMCID: PMC8957389; DOI: 10.1121/10.0009838.
Abstract
Cochlear-implant (CI) users rely heavily on temporal envelope cues for speech understanding. This study examined whether their sensitivity to temporal cues in word segments is affected when the words are preceded by non-informative carrier sentences. Thirteen adult CI users performed phonemic categorization tasks that present primarily temporally based word contrasts: Buy-Pie contrast with word-initial stop of varying voice-onset time (VOT), and Dish-Ditch contrast with varying silent intervals preceding the word-final fricative. These words were presented in isolation or were preceded by carrier stimuli including a sentence, a sentence-envelope-modulated noise, or an unmodulated speech-shaped noise. While participants were able to categorize both word contrasts, stimulus context effects were observed primarily for the Buy-Pie contrast, such that participants reported more "Buy" responses for words with longer VOTs in conditions with carrier stimuli than in isolation. The two non-speech carrier stimuli yielded similar or even greater context effects than sentences. The context effects disappeared when target words were delayed from the carrier stimuli for ≥75 ms. These results suggest that stimulus contexts affect auditory temporal processing in CI users but the context effects appear to be cue-specific. The context effects may be governed by general auditory processes, not those specific to speech processing.
Affiliation(s)
- Zilong Xie
- Department of Hearing and Speech, University of Kansas Medical Center, 3901 Rainbow Boulevard, Kansas City, Kansas 66160, USA
- Samira Anderson
- Department of Hearing and Speech Sciences, University of Maryland, 0100 Samuel J. LeFrak Hall, College Park, Maryland 20742, USA
- Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, 0100 Samuel J. LeFrak Hall, College Park, Maryland 20742, USA
12. More Than Words: The Relative Roles of Prosody and Semantics in the Perception of Emotions in Spoken Language by Postlingual Cochlear Implant Users. Ear Hear 2022;43:1378-1389. PMID: 35030551; DOI: 10.1097/aud.0000000000001199.
Abstract
OBJECTIVES The processing of emotional speech calls for the perception and integration of semantic and prosodic cues. Although cochlear implants allow for significant auditory improvements, they are limited in their transmission of spectro-temporal fine-structure information, which may not support the processing of voice pitch cues. The goal of the current study was to compare the performance of postlingual cochlear implant (CI) users and a matched control group on perception, selective attention, and integration of emotional semantics and prosody. DESIGN Fifteen CI users and 15 normal hearing (NH) peers (age range, 18-65 years) listened to spoken sentences composed of different combinations of four discrete emotions (anger, happiness, sadness, and neutrality) presented in prosodic and semantic channels (T-RES: Test for Rating Emotions in Speech). In three separate tasks, listeners were asked to attend to the sentence as a whole, thus integrating both speech channels (integration), or to focus on one channel only (rating of target emotion) and ignore the other (selective attention). Their task was to rate how much they agreed that the sentence conveyed each of the predefined emotions. In addition, all participants performed standard tests of speech perception. RESULTS When asked to focus on one channel, semantics or prosody, both groups rated emotions similarly with comparable levels of selective attention. When the task called for channel integration, group differences were found. CI users appeared to use semantic emotional information more than did their NH peers. CI users assigned higher ratings than did their NH peers to sentences that did not present the target emotion, indicating some degree of confusion. In addition, for CI users, individual differences in speech comprehension over the phone and identification of intonation were significantly related to emotional semantic and prosodic ratings, respectively. CONCLUSIONS CI users and NH controls did not differ in perception of prosodic and semantic emotions or in auditory selective attention. However, when the task called for integration of prosody and semantics, CI users overused the semantic information (as compared with NH peers). We suggest that as CI users adopt diverse cue weighting strategies with device experience, their weighting of prosody and semantics differs from that used by NH listeners. Finally, CI users may benefit from rehabilitation strategies that strengthen perception of prosodic information to better understand emotional speech.
13. Arjmandi MK, Jahn KN, Arenberg JG. Single-Channel Focused Thresholds Relate to Vowel Identification in Pediatric and Adult Cochlear Implant Listeners. Trends Hear 2022;26:23312165221095364. PMID: 35505617; PMCID: PMC9073113; DOI: 10.1177/23312165221095364.
Abstract
Speech recognition outcomes are highly variable among pediatric and adult cochlear implant (CI) listeners. Although there is some evidence that the quality of the electrode-neuron interface (ENI) contributes to this large variability in auditory perception, its relationship with speech outcomes is not well understood. Single-channel auditory detection thresholds measured in response to focused electrical fields (i.e., focused thresholds) are sensitive to properties of ENI quality, including electrode-neuron distance, intracochlear resistance, and neural health. In the present study, focused thresholds and speech perception abilities were assessed in 15 children and 21 adult CI listeners. Focused thresholds were measured for all active electrodes using a fast sweep procedure. Speech perception performance was evaluated by assessing listeners’ ability to identify vowels presented in /h-vowel-d/ context. Consistent with prior literature, focused thresholds were lower for children than for adults, but vowel identification did not differ significantly across age groups. Higher across-array average focused thresholds, which may indicate a relatively poor ENI quality, were associated with poorer vowel identification scores in both children and adults. Adult CI listeners with longer durations of deafness had higher focused thresholds. Findings from this study demonstrate that poor-quality ENIs may contribute to reduced speech outcomes for pediatric and adult CI listeners. Estimates of ENI quality (e.g., focused thresholds) may assist in developing customized programming interventions that serve to improve the transmission of spectral cues that are important in vowel identification.
Affiliation(s)
- Meisam K Arjmandi
- Department of Otolaryngology - Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, MA, USA
- Audiology Division, Massachusetts Eye and Ear, Boston, MA, USA
- Kelly N Jahn
- Department of Otolaryngology - Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, MA, USA
- Department of Speech, Language, and Hearing, University of Texas at Dallas, Richardson, TX, USA
- Julie G Arenberg
- Department of Otolaryngology - Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, MA, USA
- Audiology Division, Massachusetts Eye and Ear, Boston, MA, USA
14. Heffner CC, Jaekel BN, Newman RS, Goupell MJ. Accuracy and cue use in word segmentation for cochlear-implant listeners and normal-hearing listeners presented vocoded speech. J Acoust Soc Am 2021;150:2936. PMID: 34717484; PMCID: PMC8528550; DOI: 10.1121/10.0006448.
Abstract
Cochlear-implant (CI) listeners experience signal degradation, which leads to poorer speech perception than normal-hearing (NH) listeners. In the present study, difficulty with word segmentation, the process of perceptually parsing the speech stream into separate words, is considered as a possible contributor to this decrease in performance. CI listeners were compared to a group of NH listeners (presented with unprocessed speech and eight-channel noise-vocoded speech) in their ability to segment phrases with word segmentation ambiguities (e.g., "an iceman" vs "a nice man"). The results showed that CI listeners and NH listeners were worse at segmenting words when hearing processed speech than NH listeners were when presented with unprocessed speech. When viewed at a broad level, all of the groups used cues to word segmentation in similar ways. Detailed analyses, however, indicated that the two processed speech groups weighted top-down knowledge cues to word boundaries more and weighted acoustic cues to word boundaries less relative to NH listeners presented with unprocessed speech.
Affiliation(s)
- Christopher C Heffner
- Program in Neuroscience and Cognitive Science, University of Maryland, College Park, Maryland 20742, USA
- Brittany N Jaekel
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Rochelle S Newman
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
15. Voice Emotion Recognition by Mandarin-Speaking Children with Cochlear Implants. Ear Hear 2021;43:165-180. PMID: 34288631; DOI: 10.1097/aud.0000000000001085.
Abstract
Objectives Emotional expressions are very important in social interactions. Children with cochlear implants can have voice emotion recognition deficits due to device limitations. Mandarin-speaking children with cochlear implants may face greater challenges than those speaking nontonal languages; the pitch information is not well preserved in cochlear implants, and such children could benefit from child-directed speech, which carries more exaggerated distinctive acoustic cues for different emotions. This study investigated voice emotion recognition, using both adult-directed and child-directed materials, in Mandarin-speaking children with cochlear implants compared with normal hearing peers. The authors hypothesized that both the children with cochlear implants and those with normal hearing would perform better with child-directed materials than with adult-directed materials. Design Thirty children (7.17-17 years of age) with cochlear implants and 27 children with normal hearing (6.92-17.08 years of age) were recruited in this study. Participants completed a nonverbal reasoning test, speech recognition tests, and a voice emotion recognition task. Children with cochlear implants over the age of 10 years also completed the Chinese version of the Nijmegen Cochlear Implant Questionnaire to evaluate the health-related quality of life. The voice emotion recognition task was a five-alternative, forced-choice paradigm, which contains sentences spoken with five emotions (happy, angry, sad, scared, and neutral) in a child-directed or adult-directed manner. Results Acoustic analyses showed substantial variations across emotions in all materials, mainly on measures of mean fundamental frequency and fundamental frequency range. Mandarin-speaking children with cochlear implants displayed a significantly poorer performance than normal hearing peers in voice emotion perception tasks, regardless of whether the performance is measured in accuracy scores, Hu value, or reaction time. Children with cochlear implants and children with normal hearing were mainly affected by the mean fundamental frequency in speech emotion recognition tasks. Chronological age had a significant effect on speech emotion recognition in children with normal hearing; however, there was no significant correlation between chronological age and accuracy scores in speech emotion recognition in children with implants. Significant effects of specific emotion and test materials (better performance with child-directed materials) in both groups of children were observed. Among the children with cochlear implants, age at implantation, percentage scores of nonverbal intelligence quotient test, and sentence recognition threshold in quiet could predict recognition performance in both accuracy scores and Hu values. Time wearing cochlear implant could predict reaction time in emotion perception tasks among children with cochlear implants. No correlation was observed between the accuracy score in voice emotion perception and the self-reported scores of health-related quality of life; however, the latter were significantly correlated with speech recognition skills among Mandarin-speaking children with cochlear implants. Conclusions Mandarin-speaking children with cochlear implants could have significant deficits in voice emotion recognition tasks compared with their normally hearing peers and can benefit from the exaggerated prosody of child-directed speech. The effects of age at cochlear implantation, speech and language development, and cognition could play an important role in voice emotion perception by Mandarin-speaking children with cochlear implants.
16. Feng L, Oxenham AJ. Spectral Contrast Effects Reveal Different Acoustic Cues for Vowel Recognition in Cochlear-Implant Users. Ear Hear 2021;41:990-997. PMID: 31815819; PMCID: PMC7874522; DOI: 10.1097/aud.0000000000000820.
Abstract
OBJECTIVES The identity of a speech sound can be affected by the spectrum of a preceding stimulus in a contrastive manner. Although such aftereffects are often reduced in people with hearing loss and cochlear implants (CIs), one recent study demonstrated larger spectral contrast effects in CI users than in normal-hearing (NH) listeners. The present study aimed to shed light on this puzzling finding. We hypothesized that poorer spectral resolution leads CI users to rely on different acoustic cues not only to identify speech sounds but also to adapt to the context. DESIGN Thirteen postlingually deafened adult CI users and 33 NH participants (listening to either vocoded or unprocessed speech) participated in this study. Psychometric functions were estimated in a vowel categorization task along the /ɪ/ to /ɛ/ (as in "bit" and "bet") continuum following a context sentence, the long-term average spectrum of which was manipulated at the level of either fine-grained local spectral cues or coarser global spectral cues. RESULTS In NH listeners with unprocessed speech, the aftereffect was determined solely by the fine-grained local spectral cues, resulting in a surprising insensitivity to the larger, global spectral cues utilized by CI users. Restricting the spectral resolution available to NH listeners via vocoding resulted in patterns of responses more similar to those found in CI users. However, the size of the contrast aftereffect remained smaller in NH listeners than in CI users. CONCLUSIONS Only the spectral contrasts used by listeners contributed to the spectral contrast effects in vowel identification. These results explain why CI users can experience larger-than-normal context effects under specific conditions. The results also suggest that adaptation to new spectral cues can be very rapid for vowel discrimination, but may follow a longer time course to influence spectral contrast effects.
Affiliation(s)
- Lei Feng
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota, USA
17. Murr AT, Canfarotta MW, O'Connell BP, Buss E, King ER, Bucker AL, Dillon SA, Rooth MA, Dedmon MM, Brown KD, Dillon MT. Speech Recognition as a Function of Age and Listening Experience in Adult Cochlear Implant Users. Laryngoscope 2021;131:2106-2111. PMID: 34043247; DOI: 10.1002/lary.29663.
Abstract
OBJECTIVES/HYPOTHESIS Speech recognition with a cochlear implant (CI) tends to be better for younger adults than older adults. However, older adults may take longer to reach asymptotic performance than younger adults. The present study aimed to characterize speech recognition as a function of age at implantation and listening experience for adult CI users. STUDY DESIGN Retrospective review. METHODS A retrospective review identified 352 adult CI recipients (387 ears) with at least 5 years of device listening experience. Speech recognition, as measured with consonant-nucleus-consonant (CNC) words in quiet and AzBio sentences in a 10-talker noise masker (10 dB signal-to-noise ratio), was reviewed at 1, 5, and 10 years postactivation. RESULTS Speech recognition was better in younger listeners, and performance was stable or continued to improve through 10 years of CI listening experience. There was no indication of differences in acclimatization as a function of age at implantation. For the better performing CI recipients, an effect of age at implantation was more apparent for sentence recognition in noise than for word recognition in quiet. CONCLUSIONS Adult CI recipients across the age range examined here experience speech recognition benefit with a CI. However, older adults perform more poorly than young adults for speech recognition in quiet and noise, with similar age effects through 5 to 10 years of listening experience. LEVEL OF EVIDENCE 3.
Affiliation(s)
- Alexander T Murr
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Michael W Canfarotta
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Brendan P O'Connell
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- English R King
- Department of Audiology, University of North Carolina Health Care, Chapel Hill, North Carolina, U.S.A
- Andrea L Bucker
- Department of Audiology, University of North Carolina Health Care, Chapel Hill, North Carolina, U.S.A
- Sarah A Dillon
- Department of Audiology, University of North Carolina Health Care, Chapel Hill, North Carolina, U.S.A
- Meredith A Rooth
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Matthew M Dedmon
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Kevin D Brown
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Margaret T Dillon
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
18. Individual Variability in Recalibrating to Spectrally Shifted Speech: Implications for Cochlear Implants. Ear Hear 2021;42:1412-1427. PMID: 33795617; DOI: 10.1097/aud.0000000000001043.
Abstract
OBJECTIVES Cochlear implant (CI) recipients are at a severe disadvantage compared with normal-hearing listeners in distinguishing consonants that differ by place of articulation because the key relevant spectral differences are degraded by the implant. One component of that degradation is the upward shifting of spectral energy that occurs with a shallow insertion depth of a CI. The present study aimed to systematically measure the effects of spectral shifting on word recognition and phoneme categorization by specifically controlling the amount of shifting and using stimuli whose identification specifically depends on perceiving frequency cues. We hypothesized that listeners would be biased toward perceiving phonemes that contain higher-frequency components because of the upward frequency shift and that intelligibility would decrease as spectral shifting increased. DESIGN Normal-hearing listeners (n = 15) heard sine wave-vocoded speech with simulated upward frequency shifts of 0, 2, 4, and 6 mm of cochlear space to simulate shallow CI insertion depth. Stimuli included monosyllabic words and /b/-/d/ and /ʃ/-/s/ continua that varied systematically by formant frequency transitions or frication noise spectral peaks, respectively. Recalibration to spectral shifting was operationally defined as shifting perceptual acoustic-phonetic mapping commensurate with the spectral shift; in other words, adjusting frequency expectations for both phonemes upward so that there is still a perceptual distinction, rather than hearing all upward-shifted phonemes as the higher-frequency member of the pair. RESULTS For moderate amounts of spectral shifting, group data suggested a general "halfway" recalibration to spectral shifting, but individual data suggested a notably different conclusion: half of the listeners were able to recalibrate fully, while the other half were utterly unable to categorize shifted speech with any reliability. There were no participants who demonstrated a pattern intermediate to these two extremes. Intelligibility of words decreased with greater amounts of spectral shifting, also showing loose clusters of better- and poorer-performing listeners. Phonetic analysis of word errors revealed certain cues were more susceptible to being compromised by a frequency shift (place and manner of articulation), while voicing was robust to spectral shifting. CONCLUSIONS Shifting the frequency spectrum of speech has systematic effects that are in line with known properties of speech acoustics, but the ensuing difficulties cannot be predicted based on tonotopic mismatch alone. Difficulties are subject to substantial individual differences in the capacity to adjust acoustic-phonetic mapping. These results help to explain why speech recognition in CI listeners cannot be fully predicted by peripheral factors like electrode placement and spectral resolution; even among listeners with functionally equivalent auditory input, there is an additional factor of simply being able or unable to flexibly adjust acoustic-phonetic mapping. This individual variability could motivate precise treatment approaches guided by an individual's relative reliance on wideband frequency representation (even if it is mismatched) or limited frequency coverage whose tonotopy is preserved.
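The "mm of cochlear space" manipulation can be related to frequency with the Greenwood (1990) place-frequency function. The sketch below shows how basal shifts of 2, 4, and 6 mm map low-, mid-, and high-frequency components upward; it uses the standard human constants and does not reproduce the study's vocoder.

```python
# Greenwood place-frequency mapping: frequency change for a given basal shift in mm.
import numpy as np

A, a, k = 165.4, 0.06, 0.88          # standard human constants (x in mm from apex)

def place_to_freq(x_mm):
    return A * (10 ** (a * x_mm) - k)

def freq_to_place(f_hz):
    return np.log10(f_hz / A + k) / a

for f in [250, 1000, 4000]:
    for shift_mm in [0, 2, 4, 6]:
        shifted = place_to_freq(freq_to_place(f) + shift_mm)
        print(f"{f:>5} Hz shifted {shift_mm} mm basally -> {shifted:7.0f} Hz")
```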
19. Nogueira W, Boghdady NE, Langner F, Gaudrain E, Başkent D. Effect of Channel Interaction on Vocal Cue Perception in Cochlear Implant Users. Trends Hear 2021;25:23312165211030166. PMID: 34461780; PMCID: PMC8411629; DOI: 10.1177/23312165211030166.
Abstract
Speech intelligibility in multitalker settings is challenging for most cochlear implant (CI) users. One possibility for this limitation is the suboptimal representation of vocal cues in implant processing, such as the fundamental frequency (F0), and the vocal tract length (VTL). Previous studies suggested that while F0 perception depends on spectrotemporal cues, VTL perception relies largely on spectral cues. To investigate how spectral smearing in CIs affects vocal cue perception in speech-on-speech (SoS) settings, adjacent electrodes were simultaneously stimulated using current steering in 12 Advanced Bionics users to simulate channel interaction. In current steering, two adjacent electrodes are simultaneously stimulated forming a channel of parallel stimulation. Three such stimulation patterns were used: Sequential (one current steering channel), Paired (two channels), and Triplet stimulation (three channels). F0 and VTL just-noticeable differences (JNDs; Task 1), in addition to SoS intelligibility (Task 2) and comprehension (Task 3), were measured for each stimulation strategy. In Tasks 2 and 3, four maskers were used: the same female talker, a male voice obtained by manipulating both F0 and VTL (F0+VTL) of the original female speaker, a voice where only F0 was manipulated, and a voice where only VTL was manipulated. JNDs were measured relative to the original voice for the F0, VTL, and F0+VTL manipulations. When spectral smearing was increased from Sequential to Triplet, a significant deterioration in performance was observed for Tasks 1 and 2, with no differences between Sequential and Paired stimulation. Data from Task 3 were inconclusive. These results imply that CI users may tolerate certain amounts of channel interaction without significant reduction in performance on tasks relying on voice perception. This points to possibilities for using parallel stimulation in CIs for reducing power consumption.
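Voice-cue JNDs such as those in Task 1 are typically obtained with adaptive procedures; the study's exact method is not described here, so the sketch below simply illustrates a generic 2-down/1-up staircase (converging near 70.7% correct) with a simulated listener and invented parameter values.

```python
# Generic 2-down/1-up adaptive staircase with a simulated listener (illustrative only).
import numpy as np

rng = np.random.default_rng(3)

def listener_correct(delta, jnd=4.0):
    """Simulated listener: probability correct rises from chance (0.5) toward 1
    as the voice-cue difference grows (arbitrary units)."""
    p_correct = 0.5 + 0.5 / (1 + np.exp(-(delta - jnd)))
    return rng.random() < p_correct

delta, step = 12.0, 2.0                  # starting cue difference and step size
streak, reversals, last_dir = 0, [], None
while len(reversals) < 8:
    if listener_correct(delta):
        streak += 1
        if streak < 2:
            continue                     # need 2 correct in a row before decreasing
        streak, direction = 0, "down"
    else:
        streak, direction = 0, "up"
    if last_dir is not None and direction != last_dir:
        reversals.append(delta)          # track reversal points
    last_dir = direction
    delta = max(delta - step, 0.5) if direction == "down" else delta + step

print(f"estimated JND ~ {np.mean(reversals[-6:]):.2f} (mean of last reversals)")
```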
Affiliation(s)
- Waldo Nogueira
- Department of Otolaryngology, Medical University Hannover and Cluster of Excellence Hearing4all, Hanover, Germany
- Nawal El Boghdady
- Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Research School of Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands
- Florian Langner
- Department of Otolaryngology, Medical University Hannover and Cluster of Excellence Hearing4all, Hanover, Germany
- Etienne Gaudrain
- Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Research School of Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands
- Lyon Neuroscience Research Center, CNRS UMR 5292, INSERM U1028, University Lyon 1, Lyon, France
- Deniz Başkent
- Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Research School of Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands
20. Xie Z, Gaskins CR, Shader MJ, Gordon-Salant S, Anderson S, Goupell MJ. Age-Related Temporal Processing Deficits in Word Segments in Adult Cochlear-Implant Users. Trends Hear 2020;23:2331216519886688. PMID: 31808373; PMCID: PMC6900735; DOI: 10.1177/2331216519886688.
Abstract
Aging may limit speech understanding outcomes in cochlear-implant (CI) users.
Here, we examined age-related declines in auditory temporal processing as a
potential mechanism that underlies speech understanding deficits associated with
aging in CI users. Auditory temporal processing was assessed with a
categorization task for the words dish and ditch (i.e., identify each token as
the word dish or ditch) on a continuum of
speech tokens with varying silence duration (0 to 60 ms) prior to the final
fricative. In Experiments 1 and 2, younger CI (YCI), middle-aged CI (MCI), and
older CI (OCI) users participated in the categorization task across a range of
presentation levels (25 to 85 dB). Relative to YCI, OCI required longer silence
durations to identify ditch and exhibited reduced ability to distinguish the
words dish and ditch (shallower slopes in the categorization function).
Critically, we observed age-related performance differences only at higher
presentation levels. This contrasted with findings from normal-hearing listeners
in Experiment 3 that demonstrated age-related performance differences
independent of presentation level. In summary, aging in CI users appears to
degrade the ability to utilize brief temporal cues in word identification,
particularly at high levels. Age-specific CI programming may potentially improve
clinical outcomes for speech understanding performance by older CI
listeners.
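Categorization functions like the one described above are typically summarized by fitting a logistic curve to the proportion of "ditch" responses across the silence-duration continuum; the fitted midpoint gives the category boundary and the fitted slope indexes how sharply the two words are distinguished. The sketch below shows a minimal version of that fit using invented response proportions, not data from the study.

import numpy as np
from scipy.optimize import curve_fit

def logistic(x, boundary, slope):
    # Proportion of "ditch" responses as a function of silence duration (ms).
    return 1.0 / (1.0 + np.exp(-slope * (x - boundary)))

silence_ms = np.array([0, 10, 20, 30, 40, 50, 60], dtype=float)
p_ditch = np.array([0.02, 0.05, 0.20, 0.55, 0.85, 0.95, 0.98])  # hypothetical data

(boundary, slope), _ = curve_fit(logistic, silence_ms, p_ditch, p0=[30.0, 0.2])
print(f"boundary ~ {boundary:.1f} ms, slope ~ {slope:.2f} per ms")
# A shallower fitted slope corresponds to a reduced ability to distinguish dish from ditch.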
Collapse
Affiliation(s)
- Zilong Xie
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, USA
| | - Casey R Gaskins
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, USA
| | - Maureen J Shader
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, USA
| | - Sandra Gordon-Salant
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, USA
| | - Samira Anderson
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, USA
| | - Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, USA
| |
Collapse
|
21
|
Winn MB, Moore AN. Perceptual weighting of acoustic cues for accommodating gender-related talker differences heard by listeners with normal hearing and with cochlear implants. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 148:496. [PMID: 32873011 PMCID: PMC7402726 DOI: 10.1121/10.0001672] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/27/2020] [Revised: 05/31/2020] [Accepted: 07/14/2020] [Indexed: 06/11/2023]
Abstract
Listeners must accommodate acoustic differences between vocal tracts and speaking styles of conversation partners, a process called normalization or accommodation. This study explores what acoustic cues are used to make this perceptual adjustment by listeners with normal hearing or with cochlear implants, when the acoustic variability is related to the talker's gender. A continuum between /ʃ/ and /s/ was paired with naturally spoken vocalic contexts that were parametrically manipulated to vary by numerous cues for talker gender including fundamental frequency (F0), vocal tract length (formant spacing), and direct spectral contrast with the fricative. The goal was to examine relative contributions of these cues toward the tendency to have a lower-frequency acoustic boundary for fricatives spoken by men (found in numerous previous studies). Normal-hearing listeners relied primarily on formant spacing and much less on F0. The CI listeners were individually variable, with the F0 cue emerging as the strongest cue on average.
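Relative cue weights of the kind reported in studies like this one are often estimated by regressing trial-by-trial categorization responses on standardized versions of each cue and comparing the fitted coefficients. The sketch below illustrates the idea with simulated trials; the cue names, weights, and trial counts are assumptions for illustration, not the authors' analysis.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_trials = 500

# Hypothetical standardized cue values on each trial (z-scored continuum steps).
formant_spacing = rng.uniform(-1, 1, n_trials)  # VTL-related cue
f0 = rng.uniform(-1, 1, n_trials)               # voice pitch cue

# Simulate a listener who weights formant spacing heavily and F0 only weakly.
logit = 3.0 * formant_spacing + 0.5 * f0
resp = (rng.random(n_trials) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

model = LogisticRegression().fit(np.column_stack([formant_spacing, f0]), resp)
w_formant, w_f0 = model.coef_[0]
total = abs(w_formant) + abs(w_f0)
print(f"normalized weights: formant spacing {w_formant / total:.2f}, F0 {w_f0 / total:.2f}")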
Collapse
Affiliation(s)
- Matthew B Winn
- Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota 55455, USA
| | - Ashley N Moore
- Department of Speech & Hearing Sciences, University of Washington, Seattle, Washington 98105, USA
| |
Collapse
|
22
|
DiNino M, Arenberg JG, Duchen ALR, Winn MB. Effects of Age and Cochlear Implantation on Spectrally Cued Speech Categorization. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:2425-2440. [PMID: 32552327 PMCID: PMC7838840 DOI: 10.1044/2020_jslhr-19-00127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/06/2019] [Revised: 08/12/2019] [Accepted: 03/30/2020] [Indexed: 06/11/2023]
Abstract
Purpose Weighting of acoustic cues for perceiving place-of-articulation speech contrasts was measured to determine the separate and interactive effects of age and use of cochlear implants (CIs). It has been found that adults with normal hearing (NH) show reliance on fine-grained spectral information (e.g., formants), whereas adults with CIs show reliance on broad spectral shape (e.g., spectral tilt). In question was whether children with NH and CIs would demonstrate the same patterns as adults, or show differences based on ongoing maturation of hearing and phonetic skills. Method Children and adults with NH and with CIs categorized a /b/-/d/ speech contrast based on two orthogonal spectral cues. Among CI users, phonetic cue weights were compared to vowel identification scores and Spectral-Temporally Modulated Ripple Test thresholds. Results NH children and adults both relied relatively more on the fine-grained formant cue and less on the broad spectral tilt cue compared to participants with CIs. However, early-implanted children with CIs better utilized the formant cue compared to adult CI users. Formant cue weights correlated with CI participants' vowel recognition and in children, also related to Spectral-Temporally Modulated Ripple Test thresholds. Adults and child CI users with very poor phonetic perception showed additive use of the two cues, whereas those with better and/or more mature cue usage showed a prioritized trading relationship, akin to NH listeners. Conclusions Age group and hearing modality can influence phonetic cue-weighting patterns. Results suggest that simple nonlexical categorization tests correlate with more general speech recognition skills of children and adults with CIs.
Collapse
Affiliation(s)
- Mishaela DiNino
- Department of Psychology, Carnegie Mellon University, Pittsburgh, PA
| | - Julie G. Arenberg
- Massachusetts Eye and Ear, Harvard Medical School Department of Otolaryngology, Boston
| | | | - Matthew B. Winn
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis
| |
Collapse
|
23
|
Canfarotta MW, O'Connell BP, Buss E, Pillsbury HC, Brown KD, Dillon MT. Influence of Age at Cochlear Implantation and Frequency-to-Place Mismatch on Early Speech Recognition in Adults. Otolaryngol Head Neck Surg 2020; 162:926-932. [PMID: 32178574 PMCID: PMC8590812 DOI: 10.1177/0194599820911707] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/08/2023]
Abstract
OBJECTIVE Default frequency filters of cochlear implant (CI) devices assign frequency information irrespective of intracochlear position, resulting in varying degrees of frequency-to-place mismatch. Substantial mismatch negatively influences speech recognition in postlingually deafened CI recipients, and acclimatization may be particularly challenging for older adults due to effects of aging on the auditory pathway. The present report investigated the influence of mismatch and age at implantation on speech recognition within the initial 6 months of CI use. STUDY DESIGN Retrospective review. SETTING Tertiary referral center. SUBJECTS AND METHODS Forty-eight postlingually deafened adult CI recipients of lateral wall electrode arrays underwent postoperative computed tomography to determine angular insertion depth of each electrode contact. Frequency-to-place mismatch was determined by comparing spiral ganglion place frequencies to default frequency filters. Consonant-nucleus-consonant (CNC) scores in the CI-alone condition at 1, 3, and 6 months postactivation were compared to the degree of mismatch at 1500 Hz and age at implantation. RESULTS Younger adult CI recipients experienced more rapid growth in speech recognition during the initial 6 months postactivation. Greater degrees of frequency-to-place mismatch were associated with poorer performance, yet older listeners were not particularly susceptible to this effect. CONCLUSIONS While older adults are not necessarily more sensitive to detrimental effects of frequency-to-place mismatch, other factors appear to limit early benefit with a CI in this population. These results suggest that minimizing mismatch could optimize outcomes in adult CI recipients across the life span, which may be particularly beneficial in the elderly considering auditory processing deficits associated with advanced age.
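Frequency-to-place mismatch, as used above, is the difference between the frequency a processor assigns to an electrode and the frequency expected at that electrode's cochlear place. The study derived place frequencies from spiral ganglion maps based on imaging; as a rough stand-in for illustration only, the sketch below uses the Greenwood organ-of-Corti map and expresses the mismatch in semitones. The electrode position and filter frequency are made-up example values.

import math

def greenwood_hz(relative_place_from_apex):
    # Greenwood place-to-frequency map for the human cochlea (0 = apex, 1 = base).
    A, a, k = 165.4, 2.1, 0.88
    return A * (10 ** (a * relative_place_from_apex) - k)

place_hz = greenwood_hz(0.55)   # assumed cochlear position of one electrode contact
filter_hz = 1500.0              # center frequency of the default analysis filter (example)

mismatch_st = 12 * math.log2(filter_hz / place_hz)
print(f"place frequency ~ {place_hz:.0f} Hz, mismatch ~ {mismatch_st:+.1f} semitones")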
Collapse
Affiliation(s)
- Michael W Canfarotta
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Brendan P O'Connell
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Harold C Pillsbury
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Kevin D Brown
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Margaret T Dillon
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| |
Collapse
|
24
|
Pals C, Sarampalis A, Beynon A, Stainsby T, Başkent D. Effect of Spectral Channels on Speech Recognition, Comprehension, and Listening Effort in Cochlear-Implant Users. Trends Hear 2020; 24:2331216520904617. [PMID: 32189585 PMCID: PMC7082863 DOI: 10.1177/2331216520904617] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
In favorable listening conditions, cochlear-implant (CI) users can reach high
speech recognition scores with as little as seven active electrodes. Here, we
hypothesized that even when speech recognition is high, additional spectral
channels may still benefit other aspects of speech perception, such as
comprehension and listening effort. Twenty-five adult, postlingually deafened CI
users, selected from two Dutch implant centers for high clinical word
identification scores, participated in two experiments. Experimental conditions
were created by varying the number of active electrodes of the CIs between 7 and
15. In Experiment 1, response times (RTs) on the secondary task in a dual-task
paradigm were used as an indirect measure of listening effort, and in Experiment
2, sentence verification task (SVT) accuracy and RTs were used to measure speech
comprehension and listening effort, respectively. Speech recognition was near
ceiling for all conditions tested, as intended by the design. However, the
dual-task paradigm failed to show the hypothesized decrease in RTs with
increasing spectral channels. The SVT did show a systematic improvement in both
speech comprehension and response speed across all conditions. In conclusion,
the SVT revealed additional benefits in both speech comprehension and listening
effort for conditions in which high speech recognition was already achieved.
Hence, adding spectral channels may provide benefits for CI listeners that may
not be reflected by traditional speech tests. The SVT is a relatively simple
task that is easy to implement and may therefore be a good candidate for
identifying such additional benefits in research or clinical settings.
Collapse
Affiliation(s)
- Carina Pals
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, the Netherlands
| | | | - Andy Beynon
- Department of Otorhinolaryngology, Head and Neck Surgery, Hearing and Implants, Radboud University Medical Centre, Nijmegen, the Netherlands
| | | | - Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, the Netherlands
| |
Collapse
|
25
|
McMurray B, Ellis TP, Apfelbaum KS. How Do You Deal With Uncertainty? Cochlear Implant Users Differ in the Dynamics of Lexical Processing of Noncanonical Inputs. Ear Hear 2020; 40:961-980. [PMID: 30531260 PMCID: PMC6551335 DOI: 10.1097/aud.0000000000000681] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
OBJECTIVES Work in normal-hearing (NH) adults suggests that spoken language processing involves coping with ambiguity. Even a clearly spoken word contains brief periods of ambiguity as it unfolds over time, and early portions will not be sufficient to uniquely identify the word. However, beyond this temporary ambiguity, NH listeners must also cope with the loss of information due to reduced forms, dialect, and other factors. A recent study suggests that NH listeners may adapt to increased ambiguity by changing the dynamics of how they commit to candidates at a lexical level. Cochlear implant (CI) users must also frequently deal with highly degraded input, in which there is less information available in the input to recover a target word. The authors asked here whether their frequent experience with this leads to lexical dynamics that are better suited for coping with uncertainty. DESIGN Listeners heard words either correctly pronounced (dog) or mispronounced at onset (gog) or offset (dob). Listeners selected the corresponding picture from a screen containing pictures of the target and three unrelated items. While they did this, fixations to each object were tracked as a measure of the time course of identifying the target. The authors tested 44 postlingually deafened adult CI users in 2 groups (23 used standard electric-only configurations, and 21 supplemented the CI with a hearing aid), along with 28 age-matched age-typical hearing (ATH) controls. RESULTS All three groups recognized the target word accurately, though each showed a small decrement for mispronounced forms (larger in both types of CI users). Analysis of fixations showed a close time locking to the timing of the mispronunciation. Onset mispronunciations delayed initial fixations to the target, but fixations to the target showed partial recovery by the end of the trial. Offset mispronunciations showed no effect early, but suppressed looking later. This pattern was attested in all three groups, though both types of CI users were slower and did not commit fully to the target. When the authors quantified the degree of disruption (by the mispronounced forms), they found that both groups of CI users showed less disruption than ATH listeners during the first 900 msec of processing. Finally, an individual differences analysis showed that within the CI users, the dynamics of fixations predicted speech perception outcomes over and above accuracy in this task and that CI users with the more rapid fixation patterns of ATH listeners showed better outcomes. CONCLUSIONS Postlingually deafened CI users process speech incrementally (as do ATH listeners), though they commit more slowly and less strongly to a single item than do ATH listeners. This may allow them to cope more flexibly with mispronunciations.
Collapse
Affiliation(s)
- Bob McMurray
- Departments of Psychological and Brain Sciences, Communication Sciences and Disorders, Otolaryngology, University of Iowa, Iowa City, Iowa, USA
| | - Tyler P Ellis
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, Iowa, USA
| | - Keith S Apfelbaum
- Department of Psychological and Brain Sciences, University of Iowa, Iowa City, Iowa, USA
- Foundations in Learning, Inc., Coralville, Iowa, USA
| |
Collapse
|
26
|
Winn MB. Accommodation of gender-related phonetic differences by listeners with cochlear implants and in a variety of vocoder simulations. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:174. [PMID: 32006986 PMCID: PMC7341679 DOI: 10.1121/10.0000566] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/30/2019] [Revised: 12/06/2019] [Accepted: 12/13/2019] [Indexed: 06/01/2023]
Abstract
Speech perception requires accommodation of a wide range of acoustic variability across talkers. A classic example is the perception of "sh" and "s" fricative sounds, which are categorized according to spectral details of the consonant itself, and also by the context of the voice producing it. Because women's and men's voices occupy different frequency ranges, a listener is required to make a corresponding adjustment of acoustic-phonetic category space for these phonemes when hearing different talkers. This pattern is commonplace in everyday speech communication, and yet might not be captured in accuracy scores for whole words, especially when word lists are spoken by a single talker. Phonetic accommodation for fricatives "s" and "sh" was measured in 20 cochlear implant (CI) users and in a variety of vocoder simulations, including those with noise carriers with and without peak picking, simulated spread of excitation, and pulsatile carriers. CI listeners showed strong phonetic accommodation as a group. Each vocoder produced phonetic accommodation except the 8-channel noise vocoder, despite its historically good match with CI users in word intelligibility. Phonetic accommodation is largely independent of linguistic factors and thus might offer information complementary to speech intelligibility tests which are partially affected by language processing.
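Noise-vocoder simulations of CI hearing, used here and in several other entries in this list, follow a common recipe: split the signal into analysis bands, extract each band's temporal envelope, and use the envelopes to modulate band-limited noise carriers. The sketch below is a minimal, generic version of that recipe; the channel edges, filter orders, and envelope cutoff are typical assumed values rather than the settings of any particular study, and the sampling rate must exceed twice the highest band edge.

import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_channels=8, f_lo=100.0, f_hi=7000.0, env_cutoff=300.0):
    # Minimal noise vocoder: log-spaced analysis bands, Hilbert envelopes,
    # and bandpass-filtered noise carriers modulated by those envelopes.
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    noise = np.random.default_rng(0).standard_normal(len(signal))
    sos_env = butter(2, env_cutoff, btype="low", fs=fs, output="sos")
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos_band = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(sos_band, signal)
        env = np.clip(sosfiltfilt(sos_env, np.abs(hilbert(band))), 0.0, None)
        out += env * sosfiltfilt(sos_band, noise)
    peak = np.max(np.abs(out)) + 1e-12
    return out * (np.max(np.abs(signal)) / peak)  # match the original peak level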
Collapse
Affiliation(s)
- Matthew B Winn
- Department of Speech & Hearing Sciences, University of Minnesota, 164 Pillsbury Drive Southeast, Minneapolis, Minnesota 55455, USA
| |
Collapse
|
27
|
Gianakas SP, Winn MB. Lexical bias in word recognition by cochlear implant listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 146:3373. [PMID: 31795696 PMCID: PMC6948217 DOI: 10.1121/1.5132938] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/31/2019] [Revised: 10/04/2019] [Accepted: 10/14/2019] [Indexed: 06/03/2023]
Abstract
When hearing an ambiguous speech sound, listeners show a tendency to perceive it as a phoneme that would complete a real word, rather than completing a nonsense word. For example, a sound that could be heard as either /b/ or /ɡ/ is perceived as /b/ when followed by "_ack" but perceived as /ɡ/ when followed by "_ap." Because the target sound is acoustically identical across both environments, this effect demonstrates the influence of top-down lexical processing in speech perception. Degradations in the auditory signal were hypothesized to render speech stimuli more ambiguous, and therefore promote increased lexical bias. Stimuli included three speech continua that varied by spectral cues of varying speeds, including stop formant transitions (fast), fricative spectra (medium), and vowel formants (slow). Stimuli were presented to listeners with cochlear implants (CIs), and also to listeners with normal hearing with clear spectral quality, or with varying amounts of spectral degradation using a noise vocoder. Results indicated an increased lexical bias effect with degraded speech and for CI listeners, for whom the effect size was related to segment duration. This method can probe an individual's reliance on top-down processing even at the level of simple lexical/phonetic perception.
Collapse
Affiliation(s)
- Steven P Gianakas
- Department of Speech-Language-Hearing Sciences, University of Minnesota, 164 Pillsbury Drive SE, Minneapolis, Minnesota 55455, USA
| | - Matthew B Winn
- Department of Speech-Language-Hearing Sciences, University of Minnesota, 164 Pillsbury Drive SE, Minneapolis, Minnesota 55455, USA
| |
Collapse
|
28
|
Dingemanse JG, Goedegebure A. The Important Role of Contextual Information in Speech Perception in Cochlear Implant Users and Its Consequences in Speech Tests. Trends Hear 2019; 23:2331216519838672. [PMID: 30991904 PMCID: PMC6472157 DOI: 10.1177/2331216519838672] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open
Abstract
This study investigated the role of contextual information in speech
intelligibility, the influence of verbal working memory on the use of contextual
information, and the suitability of an ecologically valid sentence test
containing contextual information, compared with a CNC
(Consonant-Nucleus-Consonant) word test, in cochlear implant (CI) users. Speech
intelligibility performance was assessed in 50 postlingual adult CI users on
sentence lists and on CNC word lists. Results were compared with a
normal-hearing (NH) group. The influence of contextual information was
calculated from three different context models. Working memory capacity was
measured with a Reading Span Test. CI recipients made significantly more use of
contextual information in recognition of CNC words and sentences than NH
listeners. Their use of contextual information in sentences was related to
verbal working memory capacity but not to age, indicating that the ability to
use context is dependent on cognitive abilities, regardless of age. The presence
of context in sentences enhanced the sensitivity to differences in sensory
bottom-up information but also increased the risk of a ceiling effect. A
sentence test appeared to be suitable in CI users if word scoring is used and
noise is added for the best performers.
Collapse
Affiliation(s)
- J. Gertjan Dingemanse
- Department of Otorhinolaryngology and Head and Neck Surgery, Erasmus University Medical Center, Rotterdam, the Netherlands
| | - André Goedegebure
- Department of Otorhinolaryngology and Head and Neck Surgery, Erasmus University Medical Center, Rotterdam, the Netherlands
| |
Collapse
|
29
|
Chatterjee M, Kulkarni AM, Siddiqui RM, Christensen JA, Hozan M, Sis JL, Damm SA. Acoustics of Emotional Prosody Produced by Prelingually Deaf Children With Cochlear Implants. Front Psychol 2019; 10:2190. [PMID: 31632320 PMCID: PMC6779094 DOI: 10.3389/fpsyg.2019.02190] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2019] [Accepted: 09/11/2019] [Indexed: 11/27/2022] Open
Abstract
Purpose: Cochlear implants (CIs) provide reasonable levels of speech recognition in quiet, but voice pitch perception is severely impaired in CI users. The central question addressed here relates to how access to acoustic input pre-implantation influences vocal emotion production by individuals with CIs. The objective of this study was to compare acoustic characteristics of vocal emotions produced by prelingually deaf school-aged children with cochlear implants (CCIs) who were implanted at the age of 2 and had no usable hearing before implantation, with those produced by children with normal hearing (CNH), adults with normal hearing (ANH), and postlingually deaf adults with cochlear implants (ACI) who developed with good access to acoustic information prior to losing their hearing and receiving a CI. Method: A set of 20 sentences without lexically based emotional information was recorded by 13 CCI, 9 CNH, 9 ANH, and 10 ACI, each with a happy emotion and a sad emotion, without training or guidance. The sentences were analyzed for primary acoustic characteristics of the productions. Results: Significant effects of Emotion were observed in all acoustic features analyzed (mean voice pitch, standard deviation of voice pitch, intensity, duration, and spectral centroid). ACI and ANH did not differ in any of the analyses. Of the four groups, CCI produced the smallest acoustic contrasts between the emotions in voice pitch and in its standard deviation. Effects of developmental age (highly correlated with the duration of device experience) and age at implantation (moderately correlated with duration of device experience) were observed, and interactions with the children's sex were also observed. Conclusion: Although prelingually deaf CCI and postlingually deaf ACI are listening to similarly degraded speech and show similar deficits in vocal emotion perception, these groups are distinct in their productions of contrastive vocal emotions. The results underscore the importance of access to acoustic hearing in early childhood for the production of speech prosody and also suggest the need for a greater role of speech therapy in this area.
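Most of the acoustic features analyzed above (duration, intensity, spectral centroid) can be computed directly from the waveform, while the voice pitch measures come from an F0 tracker. As a hedged sketch of the non-pitch features, assuming a mono waveform already loaded as a NumPy array (this is not the authors' analysis pipeline):

import numpy as np

def basic_prosody_features(x, fs):
    # Duration (s), RMS intensity (dB re full scale), and spectral centroid (Hz).
    # Mean and SD of voice pitch would come from a separate F0 tracker.
    duration_s = len(x) / fs
    rms = np.sqrt(np.mean(x ** 2))
    intensity_db = 20 * np.log10(rms + 1e-12)
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    centroid_hz = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12)
    return {"duration_s": duration_s, "intensity_db": intensity_db,
            "spectral_centroid_hz": centroid_hz}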
Collapse
Affiliation(s)
- Monita Chatterjee
- Auditory Prostheses and Perception Laboratory, Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE, United States
| | | | | | | | | | | | | |
Collapse
|
30
|
Deroche MLD, Lu HP, Lin YS, Chatterjee M, Peng SC. Processing of Acoustic Information in Lexical Tone Production and Perception by Pediatric Cochlear Implant Recipients. Front Neurosci 2019; 13:639. [PMID: 31281237 PMCID: PMC6596315 DOI: 10.3389/fnins.2019.00639] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2019] [Accepted: 06/03/2019] [Indexed: 11/13/2022] Open
Abstract
Purpose: This study examined the utilization of multiple types of acoustic information in lexical tone production and perception by pediatric cochlear implant (CI) recipients who are native speakers of Mandarin Chinese. Methods: Lexical tones were recorded from CI recipients and their peers with normal hearing (NH). Each participant was asked to produce a disyllabic word, yan jing, in which the first syllable was pronounced as Tone 3 (a low dipping tone) while the second syllable was pronounced as Tone 1 (a high level tone, meaning "eyes") or as Tone 4 (a high falling tone, meaning "eyeglasses"). In addition, a parametric manipulation in fundamental frequency (F0) and duration of Tones 1 and 4 used in a lexical tone recognition task in Peng et al. (2017) was adopted to evaluate the perceptual reliance on each dimension. Results: Mixed-effect analyses of duration, intensity, and F0 cues revealed that NH children focused exclusively on marking distinct F0 contours, while CI participants shortened Tone 4 or prolonged Tone 1 to enhance their contrast. In line with these production strategies, NH children relied primarily on F0 cues to identify the two tones, whereas CI children showed greater reliance on duration cues. Moreover, CI participants who placed greater perceptual weight on duration cues also tended to exhibit smaller changes in their F0 production. Conclusion: Pediatric CI recipients appear to contrast the secondary acoustic dimension (duration) in addition to F0 contours for both lexical tone production and perception. These findings suggest that perception and production strategies of lexical tones are well coupled in this pediatric CI population.
Collapse
Affiliation(s)
| | | | - Yung-Song Lin
- Chi-Mei Medical Center, Tainan, Taiwan
- Taipei Medical University, Taipei, Taiwan
| | | | - Shu-Chen Peng
- United States Food and Drug Administration, Silver Spring, MD, United States
| |
Collapse
|
31
|
Tamati TN, Janse E, Başkent D. Perceptual Discrimination of Speaking Style Under Cochlear Implant Simulation. Ear Hear 2019; 40:63-76. [PMID: 29742545 PMCID: PMC6319584 DOI: 10.1097/aud.0000000000000591] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2016] [Accepted: 03/12/2018] [Indexed: 11/26/2022]
Abstract
OBJECTIVES Real-life, adverse listening conditions involve a great deal of speech variability, including variability in speaking style. Depending on the speaking context, talkers may use a more casual, reduced speaking style or a more formal, careful speaking style. Attending to fine-grained acoustic-phonetic details characterizing different speaking styles facilitates the perception of the speaking style used by the talker. These acoustic-phonetic cues are poorly encoded in cochlear implants (CIs), potentially rendering the discrimination of speaking style difficult. As a first step to characterizing CI perception of real-life speech forms, the present study investigated the perception of different speaking styles in normal-hearing (NH) listeners with and without CI simulation. DESIGN The discrimination of three speaking styles (conversational reduced speech, speech from retold stories, and carefully read speech) was assessed using a speaking style discrimination task in two experiments. NH listeners classified sentence-length utterances, produced in one of the three styles, as either formal (careful) or informal (conversational). Utterances were presented with unmodified speaking rates in experiment 1 (31 NH, young adult Dutch speakers) and with modified speaking rates set to the average rate across all utterances in experiment 2 (28 NH, young adult Dutch speakers). In both experiments, acoustic noise-vocoder simulations of CIs were used to produce 12-channel (CI-12) and 4-channel (CI-4) vocoder simulation conditions, in addition to a no-simulation condition without CI simulation. RESULTS In both experiments 1 and 2, NH listeners were able to reliably discriminate the speaking styles without CI simulation. However, this ability was reduced under CI simulation. In experiment 1, participants showed poor discrimination of speaking styles under CI simulation. Listeners used speaking rate as a cue to make their judgements, even though it was not a reliable cue to speaking style in the study materials. In experiment 2, without differences in speaking rate among speaking styles, listeners showed better discrimination of speaking styles under CI simulation, using additional cues to complete the task. CONCLUSIONS The findings from the present study demonstrate that perceiving differences in three speaking styles under CI simulation is a difficult task because some important cues to speaking style are not fully available in these conditions. While some cues like speaking rate are available, this information alone may not always be a reliable indicator of a particular speaking style. Some other reliable speaking style cues, such as degraded acoustic-phonetic information and variability in speaking rate within an utterance, may be available but less salient. However, as in experiment 2, listeners' perception of speaking styles may be modified if they are constrained or trained to use these additional cues, which were more reliable in the context of the present study. Taken together, these results suggest that dealing with speech variability in real-life listening conditions may be a challenge for CI users.
Collapse
Affiliation(s)
- Terrin N. Tamati
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
| | - Esther Janse
- Centre for Language Studies, Radboud University Nijmegen, Nijmegen, The Netherlands
| | - Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
| |
Collapse
|
32
|
The Use of Static and Dynamic Cues for Vowel Identification by Children Wearing Hearing Aids or Cochlear Implants. Ear Hear 2019; 41:72-81. [PMID: 30998549 DOI: 10.1097/aud.0000000000000735] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
OBJECTIVE To examine vowel perception based on dynamic formant transition and/or static formant pattern cues in children with hearing loss while using their hearing aids or cochlear implants. We predicted that the sensorineural hearing loss would degrade formant transitions more than static formant patterns, and that shortening the duration of cues would cause more difficulty for vowel identification for these children than for their normal-hearing peers. DESIGN A repeated-measures, between-group design was used. Children 4 to 9 years of age from a university hearing services clinic who were fit for hearing aids (13 children) or who wore cochlear implants (10 children) participated. Chronologically age-matched children with normal hearing served as controls (23 children). Stimuli included three naturally produced syllables (/ba/, /bi/, and /bu/), which were presented either in their entirety or segmented to isolate the formant transition or the vowel static formant center. The stimuli were presented to listeners via loudspeaker in the sound field. Aided participants wore their own devices and listened with their everyday settings. Participants chose the vowel presented by selecting from corresponding pictures on a computer screen. RESULTS Children with hearing loss were less able to use shortened transition or shortened vowel centers to identify vowels as compared to their normal-hearing peers. Whole syllable and initial transition yielded better identification performance than the vowel center for /ɑ/, but not for /i/ or /u/. CONCLUSIONS The children with hearing loss may require a longer time window than children with normal hearing to integrate vowel cues over time because of altered peripheral encoding in spectrotemporal domains. Clinical implications include cognizance of the importance of vowel perception when developing habilitative programs for children with hearing loss.
Collapse
|
33
|
Abstract
OBJECTIVES Sonority is the relative perceptual prominence/loudness of speech sounds of the same length, stress, and pitch. Children with cochlear implants (CIs), with restored audibility and relatively intact temporal processing, are expected to benefit from the perceptual prominence cues of highly sonorous sounds. Sonority also influences lexical access through the sonority-sequencing principle (SSP), a grammatical phonotactic rule, which facilitates the recognition and segmentation of syllables within speech. The more nonsonorous the onset of a syllable is, the larger is the degree of sonority rise to the nucleus, and the more optimal the SSP. Children with CIs may experience hindered or delayed development of the language-learning rule SSP, as a result of their deprived/degraded auditory experience. The purpose of the study was to explore sonority's role in speech perception and lexical access of prelingually deafened children with CIs. DESIGN A case-control study with 15 children with CIs, 25 normal-hearing children (NHC), and 50 normal-hearing adults was conducted, using a lexical identification task of novel, nonreal CV-CV words taught via fast mapping. The CV-CV words were constructed according to four sonority conditions, entailing syllables with sonorous onsets/less optimal SSP (SS) and nonsonorous onsets/optimal SSP (NS) in all combinations, that is, SS-SS, SS-NS, NS-SS, and NS-NS. Outcome measures were accuracy and reaction times (RTs). A subgroup analysis of 12 children with CIs pair matched to 12 NHC on hearing age aimed to study the effect of oral-language exposure period on the sonority-related performance. RESULTS The children groups showed similar accuracy performance, overall and across all the sonority conditions. However, within-group comparisons showed that the children with CIs scored more accurately on the SS-SS condition relative to the NS-NS and NS-SS conditions, while the NHC performed equally well across all conditions. Additionally, adult-comparable accuracy performance was achieved by the children with CIs only on the SS-SS condition, as opposed to NS-SS, SS-NS, and SS-SS conditions for NHC. Accuracy analysis of the subgroups of children matched in hearing age showed similar results. Overall longer RTs were recorded by the children with CIs on the sonority-treated lexical task, specifically on the SS-SS condition compared with age-matched controls. However, the subgroup analysis showed that both groups of children did not differ on RTs. CONCLUSIONS Children with CIs performed better in lexical tasks relying on the sonority perceptual prominence cues, as in the SS-SS condition, than in those relying on the SSP at syllable onset, as in the NS-NS and NS-SS conditions. Template-driven word learning, an early word-learning strategy, appears to play a role in the lexical access of children with CIs whether matched in hearing age or not. The SS-SS condition acts as a preferred word template. The longer RTs brought about by the highly accurate SS-SS condition in children with CIs possibly reflect more effortful listening. The lack of an RT difference between the children groups when matched on hearing age points to the oral-language exposure period as a key factor in developing auditory processing skills.
Collapse
|
34
|
Children's Recognition of Emotional Prosody in Spectrally Degraded Speech Is Predicted by Their Age and Cognitive Status. Ear Hear 2019; 39:874-880. [PMID: 29337761 DOI: 10.1097/aud.0000000000000546] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES It is known that school-aged children with cochlear implants show deficits in voice emotion recognition relative to normal-hearing peers. Little, however, is known about normal-hearing children's processing of emotional cues in cochlear implant-simulated, spectrally degraded speech. The objective of this study was to investigate school-aged, normal-hearing children's recognition of voice emotion, and the degree to which their performance could be predicted by their age, vocabulary, and cognitive factors such as nonverbal intelligence and executive function. DESIGN Normal-hearing children (6-19 years old) and young adults were tested on a voice emotion recognition task under three different conditions of spectral degradation using cochlear implant simulations (full-spectrum, 16-channel, and 8-channel noise-vocoded speech). Measures of vocabulary, nonverbal intelligence, and executive function were obtained as well. RESULTS Adults outperformed children on all tasks, and a strong developmental effect was observed. The children's age, the degree of spectral resolution, and nonverbal intelligence were predictors of performance, but vocabulary and executive functions were not, and no interactions were observed between age and spectral resolution. CONCLUSIONS These results indicate that cognitive function and age play important roles in children's ability to process emotional prosody in spectrally degraded speech. The lack of an interaction between the degree of spectral resolution and children's age further suggests that younger and older children may benefit similarly from improvements in spectral resolution. The findings imply that younger and older children with cochlear implants may benefit similarly from technical advances that improve spectral resolution.
Collapse
|
35
|
DiNino M, Arenberg JG. Age-Related Performance on Vowel Identification and the Spectral-temporally Modulated Ripple Test in Children With Normal Hearing and With Cochlear Implants. Trends Hear 2019; 22:2331216518770959. [PMID: 29708065 PMCID: PMC5949928 DOI: 10.1177/2331216518770959] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Children’s performance on psychoacoustic tasks improves with age, but inadequate auditory input may delay this maturation. Cochlear implant (CI) users receive a degraded auditory signal with reduced frequency resolution compared with normal, acoustic hearing; thus, immature auditory abilities may contribute to the variation among pediatric CI users’ speech recognition scores. This study investigated relationships between age-related variables, spectral resolution, and vowel identification scores in prelingually deafened, early-implanted children with CIs compared with normal hearing (NH) children. All participants performed vowel identification and the Spectral-temporally Modulated Ripple Test (SMRT). Vowel stimuli for NH children were vocoded to simulate the reduced spectral resolution of CI hearing. Age positively predicted NH children’s vocoded vowel identification scores, but time with the CI was a stronger predictor of vowel recognition and SMRT performance of children with CIs. For both groups, SMRT thresholds were related to vowel identification performance, analogous to previous findings in adults. Sequential information analysis of vowel feature perception indicated greater transmission of duration-related information compared with formant features in both groups of children. In addition, the amount of F2 information transmitted predicted SMRT thresholds in children with NH and with CIs. Comparisons between the two CIs of bilaterally implanted children revealed disparate task performance levels and information transmission values within the same child. These findings indicate that adequate auditory experience contributes to auditory perceptual abilities of pediatric CI users. Further, factors related to individual CIs may be more relevant to psychoacoustic task performance than are the overall capabilities of the child.
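The sequential information analysis mentioned above builds on transmitted-information measures computed from stimulus-response confusion matrices. A minimal sketch of the underlying computation (relative transmitted information for a single feature), using an invented two-category confusion matrix rather than data from the study:

import numpy as np

def relative_transmitted_info(confusion):
    # Transmitted information T(x;y) from a stimulus-by-response count matrix,
    # normalized by the stimulus entropy H(x). Assumes every stimulus row has counts.
    p = confusion / confusion.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(p > 0, p / (px * py), 1.0)
    t_xy = np.sum(p * np.log2(ratio))
    h_x = -np.sum(px * np.log2(px))
    return float(t_xy / h_x)

# Invented confusion matrix for a binary duration feature (rows: stimuli, cols: responses).
counts = np.array([[45.0, 5.0],
                   [8.0, 42.0]])
print(f"relative transmitted information ~ {relative_transmitted_info(counts):.2f}")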
Collapse
Affiliation(s)
- Mishaela DiNino
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, USA
| | - Julie G Arenberg
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, USA
| |
Collapse
|
36
|
Gaudrain E, Başkent D. Discrimination of Voice Pitch and Vocal-Tract Length in Cochlear Implant Users. Ear Hear 2019; 39:226-237. [PMID: 28799983 PMCID: PMC5839701 DOI: 10.1097/aud.0000000000000480] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2017] [Accepted: 06/29/2017] [Indexed: 12/02/2022]
Abstract
OBJECTIVES When listening to two competing speakers, normal-hearing (NH) listeners can take advantage of voice differences between the speakers. Users of cochlear implants (CIs) have difficulty in perceiving speech on speech. Previous literature has indicated sensitivity to voice pitch (related to fundamental frequency, F0) to be poor among implant users, while sensitivity to vocal-tract length (VTL; related to the height of the speaker and formant frequencies), the other principal voice characteristic, has not been directly investigated in CIs. A few recent studies evaluated F0 and VTL perception indirectly, through voice gender categorization, which relies on perception of both voice cues. These studies revealed that, contrary to prior literature, CI users seem to rely exclusively on F0 while not utilizing VTL to perform this task. The objective of the present study was to directly and systematically assess raw sensitivity to F0 and VTL differences in CI users to define the extent of the deficit in voice perception. DESIGN The just-noticeable differences (JNDs) for F0 and VTL were measured in 11 CI listeners using triplets of consonant-vowel syllables in an adaptive three-alternative forced choice method. RESULTS The results showed that while NH listeners had average JNDs of 1.95 and 1.73 semitones (st) for F0 and VTL, respectively, CI listeners showed JNDs of 9.19 and 7.19 st. These JNDs correspond to differences of 70% in F0 and 52% in VTL. For comparison to the natural range of voices in the population, the F0 JND in CIs remains smaller than the typical male-female F0 difference. However, the average VTL JND in CIs is about twice as large as the typical male-female VTL difference. CONCLUSIONS These findings, thus, directly confirm that CI listeners do not seem to have sufficient access to VTL cues, likely as a result of limited spectral resolution, and, hence, that CI listeners' voice perception deficit goes beyond poor perception of F0. These results provide a potential common explanation not only for a number of deficits observed in CI listeners, such as voice identification and gender categorization, but also for competing speech perception.
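The percent values quoted above follow directly from expressing the JNDs, given in semitones, as frequency ratios (one semitone is a factor of 2^(1/12)). A quick check of that arithmetic using only the numbers already reported in the abstract:

# Convert the reported JNDs from semitones to percent differences.
for label, jnd_st in [("F0", 9.19), ("VTL", 7.19)]:
    ratio = 2 ** (jnd_st / 12)
    print(f"{label}: {jnd_st} st corresponds to a {100 * (ratio - 1):.1f}% difference")
# Prints roughly 70% for F0 and 51.5% (about 52%) for VTL, in line with the abstract.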
Collapse
Affiliation(s)
- Etienne Gaudrain
- University of Groningen, University Medical Center Groningen, Department of Otorhinolaryngology-Head and Neck Surgery, Groningen, The Netherlands; CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Université Lyon, Lyon, France; and Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
| | - Deniz Başkent
- University of Groningen, University Medical Center Groningen, Department of Otorhinolaryngology-Head and Neck Surgery, Groningen, The Netherlands; CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Université Lyon, Lyon, France; and Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
| |
Collapse
|
37
|
Feng L, Oxenham AJ. Spectral contrast effects produced by competing speech contexts. J Exp Psychol Hum Percept Perform 2018; 44:1447-1457. [PMID: 29847973 PMCID: PMC6110988 DOI: 10.1037/xhp0000546] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
The long-term spectrum of a preceding sentence can alter the perception of a following speech sound in a contrastive manner. This speech context effect contributes to our ability to extract reliable spectral characteristics of the surrounding acoustic environment and to compensate for the voice characteristics of different speakers or spectral colorations in different listening environments to maintain perceptual constancy. The extent to which such effects are mediated by low-level "automatic" processes, or require directed attention, remains unknown. This study investigated spectral context effects by measuring the effects of two competing sentences on the phoneme category boundary between /i/ and /ε/ in a following target word, while directing listeners' attention to one or the other context sentence. Spatial separation of the context sentences was achieved either by presenting them to different ears, or by presenting them to both ears but imposing an interaural time difference (ITD) between the ears. The results confirmed large context effects based on ear of presentation. Smaller effects were observed based on either ITD or attention. The results, combined with predictions from a two-stage model, suggest that ear-specific factors dominate speech context effects but that the effects can be modulated by higher-level features, such as perceived location, and by attention.
Collapse
Affiliation(s)
- Lei Feng
- Department of Psychology, University of Minnesota
| | | |
Collapse
|
38
|
Age-Related Differences in the Processing of Temporal Envelope and Spectral Cues in a Speech Segment. Ear Hear 2018; 38:e335-e342. [PMID: 28562426 DOI: 10.1097/aud.0000000000000447] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
OBJECTIVES As people age, they experience reduced temporal processing abilities. This results in poorer ability to understand speech, particularly for degraded input signals. Cochlear implants (CIs) convey speech information via the temporal envelopes of a spectrally degraded input signal. Because there is an increasing number of older CI users, there is a need to understand how temporal processing changes with age. Therefore, the goal of this study was to quantify age-related reduction in temporal processing abilities when attempting to discriminate words based on temporal envelope information from spectrally degraded signals. DESIGN Younger normal-hearing (YNH) and older normal-hearing (ONH) participants were presented a continuum of speech tokens that varied in silence duration between phonemes (0 to 60 ms in 10-ms steps), and were asked to identify whether the stimulus was perceived more as the word "dish" or "ditch." Stimuli were vocoded using tonal carriers. The number of channels (1, 2, 4, 8, 16, and unprocessed) and temporal envelope low-pass filter cutoff frequency (50 and 400 Hz) were systematically varied. RESULTS For the unprocessed conditions, the YNH participants perceived the word ditch for smaller silence durations than the ONH participants, indicating that aging affects temporal processing abilities. There was no difference in performance between the unprocessed and 16-channel, 400-Hz vocoded stimuli. Decreasing the number of spectral channels caused decreased ability to distinguish dish and ditch. Decreasing the envelope cutoff frequency also caused decreased ability to distinguish dish and ditch. The overall pattern of results revealed that reductions in spectral and temporal information had a relatively larger effect on the ONH participants compared with the YNH participants. CONCLUSIONS Aging reduces the ability to utilize brief temporal cues in speech segments. Reducing spectral information-as occurs in a channel vocoder and in CI speech processing strategies-forces participants to use temporal envelope information; however, older participants are less capable of utilizing this information. These results suggest that providing as much spectral and temporal speech information as possible would benefit older CI users relatively more than younger CI users. In addition, the present findings help set expectations of clinical outcomes for speech understanding performance by adult CI users as a function of age.
Collapse
|
39
|
Feng L, Oxenham AJ. Effects of spectral resolution on spectral contrast effects in cochlear-implant users. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 143:EL468. [PMID: 29960500 PMCID: PMC6002271 DOI: 10.1121/1.5042082] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/29/2018] [Revised: 05/02/2018] [Accepted: 05/27/2018] [Indexed: 06/08/2023]
Abstract
The identity of a speech sound can be affected by the long-term spectrum of a preceding stimulus. Poor spectral resolution of cochlear implants (CIs) may affect such context effects. Here, spectral contrast effects on a phoneme category boundary were investigated in CI users and normal-hearing (NH) listeners. Surprisingly, larger contrast effects were observed in CI users than in NH listeners, even when spectral resolution in NH listeners was limited via vocoder processing. The results may reflect a different weighting of spectral cues by CI users, based on poorer spectral resolution, which in turn may enhance some spectral contrast effects.
Collapse
Affiliation(s)
- Lei Feng
- Department of Psychology, University of Minnesota, N218 Elliott Hall, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
| | - Andrew J Oxenham
- Department of Psychology, University of Minnesota, N218 Elliott Hall, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
| |
Collapse
|
40
|
Assessment of Spectral and Temporal Resolution in Cochlear Implant Users Using Psychoacoustic Discrimination and Speech Cue Categorization. Ear Hear 2018; 37:e377-e390. [PMID: 27438871 DOI: 10.1097/aud.0000000000000328] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES This study was conducted to measure auditory perception by cochlear implant users in the spectral and temporal domains, using tests of either categorization (using speech-based cues) or discrimination (using conventional psychoacoustic tests). The authors hypothesized that traditional nonlinguistic tests assessing spectral and temporal auditory resolution would correspond to speech-based measures assessing specific aspects of phonetic categorization assumed to depend on spectral and temporal auditory resolution. The authors further hypothesized that speech-based categorization performance would ultimately be a superior predictor of speech recognition performance, because of the fundamental nature of speech recognition as categorization. DESIGN Nineteen cochlear implant listeners and 10 listeners with normal hearing participated in a suite of tasks that included spectral ripple discrimination, temporal modulation detection, and syllable categorization, which was split into a spectral cue-based task (targeting the /ba/-/da/ contrast) and a timing cue-based task (targeting the /b/-/p/ and /d/-/t/ contrasts). Speech sounds were manipulated to contain specific spectral or temporal modulations (formant transitions or voice onset time, respectively) that could be categorized. Categorization responses were quantified using logistic regression to assess perceptual sensitivity to acoustic phonetic cues. Word recognition testing was also conducted for cochlear implant listeners. RESULTS Cochlear implant users were generally less successful at utilizing both spectral and temporal cues for categorization compared with listeners with normal hearing. For the cochlear implant listener group, spectral ripple discrimination was significantly correlated with the categorization of formant transitions; both were correlated with better word recognition. Temporal modulation detection using 100- and 10-Hz-modulated noise was not correlated either with the cochlear implant subjects' categorization of voice onset time or with word recognition. Word recognition was correlated more closely with categorization of the controlled speech cues than with performance on the psychophysical discrimination tasks. CONCLUSIONS When evaluating people with cochlear implants, controlled speech-based stimuli are feasible to use in tests of auditory cue categorization, to complement traditional measures of auditory discrimination. Stimuli based on specific speech cues correspond to counterpart nonlinguistic measures of discrimination, but potentially show better correspondence with speech perception more generally. The ubiquity of the spectral (formant transition) and temporal (voice onset time) stimulus dimensions across languages highlights the potential to use this testing approach even in cases where English is not the native language.
Collapse
|
41
|
Huang YT, Newman RS, Catalano A, Goupell MJ. Using prosody to infer discourse prominence in cochlear-implant users and normal-hearing listeners. Cognition 2017; 166:184-200. [PMID: 28578222 DOI: 10.1016/j.cognition.2017.05.029] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2016] [Revised: 05/09/2017] [Accepted: 05/19/2017] [Indexed: 11/16/2022]
Abstract
Cochlear implants (CIs) provide speech perception to adults with severe-to-profound hearing loss, but the acoustic signal remains severely degraded. Limited access to pitch cues is thought to decrease sensitivity to prosody in CI users, but co-occurring changes in intensity and duration may provide redundant cues. The current study investigates how listeners use these cues to infer discourse prominence. CI users and normal-hearing (NH) listeners were presented with sentences varying in prosody (accented vs. unaccented words) while their eye-movements were measured to referents varying in discourse status (given vs. new categories). In Experiment 1, all listeners inferred prominence when prosody on nouns distinguished categories ("SANDWICH"→not sandals). In Experiment 2, CI users and NH listeners presented with natural speech inferred prominence when prosody on adjectives implied contrast across both categories and properties ("PINK horse"→not the orange horse). In contrast, NH listeners presented with simulated CI (vocoded) speech were sensitive to acoustic differences in prosody, but did not use these cues to infer discourse status. Together, this suggests that exploiting redundant cues for comprehension varies with the demands of language processing and prior experience with the degraded signal.
Collapse
Affiliation(s)
- Yi Ting Huang
- University of Maryland, College Park, United States.
| | | | | | | |
Collapse
|
42
|
Stilp CE. Acoustic Context Alters Vowel Categorization in Perception of Noise-Vocoded Speech. J Assoc Res Otolaryngol 2017; 18:465-481. [PMID: 28281035 PMCID: PMC5418160 DOI: 10.1007/s10162-017-0615-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2016] [Accepted: 01/30/2017] [Indexed: 10/20/2022] Open
Abstract
Normal-hearing listeners' speech perception is widely influenced by spectral contrast effects (SCEs), where perception of a given sound is biased away from stable spectral properties of preceding sounds. Despite this influence, it is not clear how these contrast effects affect speech perception for cochlear implant (CI) users whose spectral resolution is notoriously poor. This knowledge is important for understanding how CIs might better encode key spectral properties of the listening environment. Here, SCEs were measured in normal-hearing listeners using noise-vocoded speech to simulate poor spectral resolution. Listeners heard a noise-vocoded sentence where low-F1 (100-400 Hz) or high-F1 (550-850 Hz) frequency regions were amplified to encourage "eh" (/ɛ/) or "ih" (/ɪ/) responses to the following target vowel, respectively. This was done by filtering with +20 dB (experiment 1a) or +5 dB gain (experiment 1b) or filtering using 100 % of the difference between spectral envelopes of /ɛ/ and /ɪ/ endpoint vowels (experiment 2a) or only 25 % of this difference (experiment 2b). SCEs influenced identification of noise-vocoded vowels in each experiment at every level of spectral resolution. In every case but one, SCE magnitudes exceeded those reported for full-spectrum speech, particularly when spectral peaks in the preceding sentence were large (+20 dB gain, 100 % of the spectral envelope difference). Even when spectral resolution was insufficient for accurate vowel recognition, SCEs were still evident. Results are suggestive of SCEs influencing CI users' speech perception as well, encouraging further investigation of CI users' sensitivity to acoustic context.
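A rough sketch of the precursor-filtering manipulation described above (raising a low-F1 or high-F1 band to bias identification of the following vowel) is given below; the file name, the filter-and-add scheme, and the filter order are illustrative assumptions rather than the study's actual signal processing.

```python
# Sketch: boost the low-F1 (100-400 Hz) or high-F1 (550-850 Hz) region of a
# precursor sentence by roughly +20 dB to induce a spectral contrast effect.
# File name and filtering scheme are assumptions for illustration only.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfiltfilt

def boost_band(signal, fs, lo_hz, hi_hz, gain_db):
    """Raise energy in [lo_hz, hi_hz] by approximately gain_db via filter-and-add."""
    sos = butter(4, [lo_hz, hi_hz], btype="bandpass", fs=fs, output="sos")
    band = sosfiltfilt(sos, signal)
    extra = (10 ** (gain_db / 20.0) - 1.0) * band    # approximate in-band boost
    return signal + extra

fs, sentence = wavfile.read("precursor_sentence.wav")   # hypothetical mono file
sentence = sentence.astype(float)

low_f1_boosted = boost_band(sentence, fs, 100, 400, gain_db=20)    # biases responses toward "eh"
high_f1_boosted = boost_band(sentence, fs, 550, 850, gain_db=20)   # biases responses toward "ih"
```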
Collapse
Affiliation(s)
- Christian E Stilp
- University of Louisville, 317 Life Sciences Building, Louisville, KY, 40292, USA.
| |
Collapse
|
43
|
Peng SC, Lu HP, Lu N, Lin YS, Deroche MLD, Chatterjee M. Processing of Acoustic Cues in Lexical-Tone Identification by Pediatric Cochlear-Implant Recipients. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2017; 60:1223-1235. [PMID: 28388709 PMCID: PMC5755546 DOI: 10.1044/2016_jslhr-s-16-0048] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/05/2016] [Revised: 07/19/2016] [Accepted: 10/27/2016] [Indexed: 05/23/2023]
Abstract
PURPOSE The objective was to investigate acoustic cue processing in lexical-tone recognition by pediatric cochlear-implant (CI) recipients who are native Mandarin speakers. METHOD Lexical-tone recognition was assessed in pediatric CI recipients and listeners with normal hearing (NH) in 2 tasks. In Task 1, participants identified naturally uttered words that were contrastive in lexical tones. For Task 2, a disyllabic word (yanjing) was manipulated orthogonally, varying in fundamental-frequency (F0) contours and duration patterns. Participants identified each token with the second syllable jing pronounced with Tone 1 (a high level tone) as eyes or with Tone 4 (a high falling tone) as eyeglasses. RESULTS CI participants' recognition accuracy was significantly lower than NH listeners' in Task 1. In Task 2, CI participants' reliance on F0 contours was significantly less than that of NH listeners; their reliance on duration patterns, however, was significantly higher than that of NH listeners. Both CI and NH listeners' performance in Task 1 was significantly correlated with their reliance on F0 contours in Task 2. CONCLUSION For pediatric CI recipients, lexical-tone recognition using naturally uttered words is primarily related to their reliance on F0 contours, although duration patterns may be used as an additional cue.
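Because the F0 and duration manipulations were orthogonal, relative reliance on each cue can be indexed by normalized coefficients from a two-predictor logistic regression; the sketch below uses invented trial data and statsmodels, and is not the authors' analysis.

```python
# Sketch: estimate relative reliance on F0 contour vs. duration pattern from
# Tone 1 / Tone 4 labeling responses. Trial data are invented for illustration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_trials = 400
f0_step = rng.integers(0, 5, n_trials)      # 5-step F0-contour continuum
dur_step = rng.integers(0, 5, n_trials)     # 5-step duration-pattern continuum

# Hypothetical listener who leans mostly on F0 and a little on duration.
logit = 1.2 * (f0_step - 2) + 0.3 * (dur_step - 2)
resp = rng.binomial(1, 1 / (1 + np.exp(-logit)))   # 1 = "Tone 1 (eyes)" label

X = sm.add_constant(np.column_stack([f0_step, dur_step]))
fit = sm.Logit(resp, X).fit(disp=0)
b_f0, b_dur = fit.params[1], fit.params[2]

# Normalized weights: proportion of total reliance attributed to each cue.
w_f0 = abs(b_f0) / (abs(b_f0) + abs(b_dur))
print(f"F0 weight: {w_f0:.2f}, duration weight: {1 - w_f0:.2f}")
```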
Collapse
Affiliation(s)
- Shu-Chen Peng
- Center for Devices and Radiological Health, United States Food and Drug Administration, Silver Spring, MD
| | | | - Nelson Lu
- Center for Devices and Radiological Health, United States Food and Drug Administration, Silver Spring, MD
| | - Yung-Song Lin
- Chi-Mei Medical Center, Tainan, Taiwan
- Taipei Medical University, Taiwan
| | | | | |
Collapse
|
44
|
Jaekel BN, Newman RS, Goupell MJ. Speech Rate Normalization and Phonemic Boundary Perception in Cochlear-Implant Users. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2017; 60:1398-1416. [PMID: 28395319 PMCID: PMC5580678 DOI: 10.1044/2016_jslhr-h-15-0427] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/11/2015] [Revised: 05/04/2016] [Accepted: 10/14/2016] [Indexed: 05/29/2023]
Abstract
PURPOSE Normal-hearing (NH) listeners rate normalize, temporarily remapping phonemic category boundaries to account for a talker's speech rate. It is unknown whether adults who use auditory prostheses called cochlear implants (CIs) can rate normalize, as CIs transmit degraded speech signals to the auditory nerve. Ineffective adjustment to rate information could explain some of the variability in this population's speech perception outcomes. METHOD Phonemes with manipulated voice-onset-time (VOT) durations were embedded in sentences with different speech rates. Twenty-three CI and 29 NH participants performed a phoneme identification task. NH participants heard the same unprocessed stimuli as the CI participants or stimuli degraded by a sine vocoder, simulating aspects of CI processing. RESULTS CI participants showed larger rate normalization effects (6.6 ms) than the NH participants (3.7 ms) and had shallower (less reliable) category boundary slopes. NH participants showed similarly shallow slopes when presented with acoustically degraded vocoded signals, but an equal or smaller rate effect in response to reductions in available spectral and temporal information. CONCLUSION CI participants can rate normalize, despite their degraded speech input, and show a larger rate effect compared to NH participants. CI participants may particularly rely on rate normalization to better maintain perceptual constancy of the speech signal.
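The rate effect reported above is the shift of the VOT category boundary between carrier-sentence rates; a minimal sketch of estimating that shift from fitted psychometric functions follows, with invented responses and an assumed 0-40 ms VOT continuum.

```python
# Sketch: estimate the VOT category boundary (in ms) after fast vs. slow
# carrier sentences and take their difference as the rate-normalization effect.
# Response data and continuum range are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

def boundary_ms(vot_ms, voiceless_resp):
    """50% crossover of a logistic fit: P('voiceless') as a function of VOT."""
    m = LogisticRegression(C=1e6).fit(np.asarray(vot_ms).reshape(-1, 1), voiceless_resp)
    return -m.intercept_[0] / m.coef_[0][0]

rng = np.random.default_rng(2)
vot = np.repeat(np.arange(0, 45, 5), 20)   # 0-40 ms VOT continuum, 20 trials per step

# Hypothetical listener: boundary near 20 ms after a slow carrier, shifted
# toward shorter VOTs after a fast carrier (classic rate normalization).
resp_slow = rng.binomial(1, 1 / (1 + np.exp(-0.4 * (vot - 20))))
resp_fast = rng.binomial(1, 1 / (1 + np.exp(-0.4 * (vot - 14))))

shift = boundary_ms(vot, resp_slow) - boundary_ms(vot, resp_fast)
print(f"rate-normalization effect: {shift:.1f} ms boundary shift")
```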
Collapse
Affiliation(s)
- Brittany N. Jaekel
- Department of Hearing and Speech Sciences, University of Maryland, College Park
| | - Rochelle S. Newman
- Department of Hearing and Speech Sciences, University of Maryland, College Park
| | - Matthew J. Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park
| |
Collapse
|
45
|
Sagi E, Svirsky MA. Contribution of formant frequency information to vowel perception in steady-state noise by cochlear implant users. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 141:1027. [PMID: 28253672 PMCID: PMC5392095 DOI: 10.1121/1.4976059] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/16/2016] [Revised: 01/12/2017] [Accepted: 01/18/2017] [Indexed: 06/06/2023]
Abstract
Cochlear implant (CI) recipients have difficulty understanding speech in noise even at moderate signal-to-noise ratios. Knowing the mechanisms they use to understand speech in noise may facilitate the search for better speech processing algorithms. In the present study, a computational model is used to assess whether CI users' vowel identification in noise can be explained by formant frequency cues (F1 and F2). Vowel identification was tested with 12 unilateral CI users in quiet and in noise. Formant cues were measured from vowels in each condition, specific to each subject's speech processor. Noise distorted the location of vowels in the F2 vs F1 plane in comparison to quiet. The best fit model to subjects' data in quiet produced model predictions in noise that were within 8% of actual scores on average. Predictions in noise were much better when assuming that subjects used a priori knowledge regarding how formant information is degraded in noise (experiment 1). However, the model's best fit to subjects' confusion matrices in noise was worse than in quiet, suggesting that CI users utilize formant cues to identify vowels in noise, but to a different extent than how they identify vowels in quiet (experiment 2).
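In the spirit of the formant-cue model described above, a toy nearest-centroid classifier in the F1-F2 plane shows how noise-shifted formants can change the predicted vowel; the centroid values are rough illustrative numbers, not the study's measurements, and the published model is considerably more elaborate.

```python
# Toy sketch: predict vowel identity from F1/F2 cues with a nearest-centroid
# classifier. Centroid values are rough illustrative numbers, not data from
# the study, and Euclidean distance in Hz is a simplifying assumption.
import numpy as np

# Hypothetical vowel centroids (F1, F2) in Hz.
centroids = {
    "i": (300, 2300),
    "eh": (600, 1800),
    "ah": (750, 1200),
    "u": (350, 900),
}

def identify(f1, f2):
    """Return the vowel whose centroid is closest to the measured formants."""
    labels = list(centroids)
    pts = np.array([centroids[v] for v in labels], dtype=float)
    d = np.linalg.norm(pts - np.array([f1, f2], dtype=float), axis=1)
    return labels[int(np.argmin(d))]

# Noise can push measured formants toward a neighboring category.
print(identify(320, 2250))   # identified as "i" under quiet-like formant values
print(identify(500, 1900))   # may slip toward "eh" when formants are distorted by noise
```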
Collapse
Affiliation(s)
- Elad Sagi
- Department of Otolaryngology, New York University School of Medicine, New York, New York 10016, USA
| | - Mario A Svirsky
- Department of Otolaryngology, New York University School of Medicine, New York, New York 10016, USA
| |
Collapse
|
46
|
DiNino M, Wright RA, Winn MB, Bierer JA. Vowel and consonant confusions from spectrally manipulated stimuli designed to simulate poor cochlear implant electrode-neuron interfaces. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2016; 140:4404. [PMID: 28039993 PMCID: PMC5392103 DOI: 10.1121/1.4971420] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/22/2016] [Revised: 10/15/2016] [Accepted: 11/22/2016] [Indexed: 05/26/2023]
Abstract
Suboptimal interfaces between cochlear implant (CI) electrodes and auditory neurons result in a loss or distortion of spectral information in specific frequency regions, which likely decreases CI users' speech identification performance. This study exploited speech acoustics to model regions of distorted CI frequency transmission to determine the perceptual consequences of suboptimal electrode-neuron interfaces. Normal hearing adults identified naturally spoken vowels and consonants after spectral information was manipulated through a noiseband vocoder: either (1) low-, middle-, or high-frequency regions of information were removed by zeroing the corresponding channel outputs, or (2) the same regions were distorted by splitting filter outputs to neighboring filters. These conditions simulated the detrimental effects of suboptimal CI electrode-neuron interfaces on spectral transmission. Vowel and consonant confusion patterns were analyzed with sequential information transmission, perceptual distance, and perceptual vowel space analyses. Results indicated that both types of spectral manipulation were equally destructive. Loss or distortion of frequency information produced similar effects on phoneme identification performance and confusion patterns. Consonant error patterns were consistently based on place of articulation. Vowel confusions showed that perceptions gravitated away from the degraded frequency region in a predictable manner, indicating that vowels can probe frequency-specific regions of spectral degradations.
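The channel-zeroing manipulation can be sketched with a basic noise-band vocoder in which selected channel outputs are dropped; channel edges, filter orders, and the envelope-extraction method below are illustrative assumptions, not the study's processing chain.

```python
# Simplified noise-band vocoder sketch with the option to zero out selected
# channels, approximating a "dead" frequency region. Channel edges, filter
# orders, and envelope extraction are illustrative assumptions.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def vocode(signal, fs, edges_hz, zeroed=()):
    """Noise-band vocoder; channels listed in `zeroed` are dropped entirely."""
    out = np.zeros(len(signal), dtype=float)
    rng = np.random.default_rng(0)
    for ch, (lo, hi) in enumerate(zip(edges_hz[:-1], edges_hz[1:])):
        if ch in zeroed:
            continue                                      # simulate a "dead" region
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(sos, signal)))   # channel temporal envelope
        carrier = rng.standard_normal(len(signal))        # noise carrier
        out += sosfiltfilt(sos, carrier) * env            # envelope-modulated noise band
    return out

# Example: 8 log-spaced channels spanning 100-7000 Hz (assumes fs of at least
# 16 kHz), with two mid-frequency channels zeroed to mimic a poor interface.
edges = np.geomspace(100.0, 7000.0, 9)
# degraded = vocode(speech, fs, edges, zeroed=(3, 4))
```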
Collapse
Affiliation(s)
- Mishaela DiNino
- Department of Speech and Hearing Sciences, University of Washington, 1417 NE 42nd Street, Box 354875, Seattle, Washington 98105, USA
| | - Richard A Wright
- Department of Linguistics, University of Washington, Guggenheim Hall, Box 352425, Seattle, Washington, 98195, USA
| | - Matthew B Winn
- Department of Speech and Hearing Sciences, University of Washington, 1417 NE 42nd Street, Box 354875, Seattle, Washington 98105, USA
| | - Julie Arenberg Bierer
- Department of Speech and Hearing Sciences, University of Washington, 1417 NE 42nd Street, Box 354875, Seattle, Washington 98105, USA
| |
Collapse
|
47
|
Winn MB. Rapid Release From Listening Effort Resulting From Semantic Context, and Effects of Spectral Degradation and Cochlear Implants. Trends Hear 2016; 20:2331216516669723. [PMID: 27698260 PMCID: PMC5051669 DOI: 10.1177/2331216516669723] [Citation(s) in RCA: 84] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2016] [Revised: 08/26/2016] [Accepted: 08/26/2016] [Indexed: 11/15/2022] Open
Abstract
People with hearing impairment are thought to rely heavily on context to compensate for reduced audibility. Here, we explore the resulting cost of this compensatory behavior, in terms of effort and the efficiency of ongoing predictive language processing. The listening task featured predictable or unpredictable sentences, and participants included people with cochlear implants as well as people with normal hearing who heard full-spectrum/unprocessed or vocoded speech. The crucial metric was the growth of the pupillary response and the reduction of this response for predictable versus unpredictable sentences, which would suggest reduced cognitive load resulting from predictive processing. Semantic context led to rapid reduction of listening effort for people with normal hearing; the reductions were observed well before the offset of the stimuli. Effort reduction was slightly delayed for people with cochlear implants and considerably more delayed for normal-hearing listeners exposed to spectrally degraded noise-vocoded signals; this pattern of results was maintained even when intelligibility was perfect. Results suggest that speed of sentence processing can still be disrupted, and exertion of effort can be elevated, even when intelligibility remains high. We discuss implications for experimental and clinical assessment of speech recognition, in which good performance can arise because of cognitive processes that occur after a stimulus, during a period of silence. Because silent gaps are not common in continuous flowing speech, the cognitive/linguistic restorative processes observed after sentences in such studies might not be available to listeners in everyday conversations, meaning that speech recognition in conventional tests might overestimate sentence-processing capability.
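The pupillometric effort metric described above, growth of dilation relative to a pre-stimulus baseline compared across predictable and unpredictable sentences, can be sketched as follows; the traces, sampling rate, and analysis window are invented for illustration.

```python
# Sketch: baseline-correct pupil traces and compare dilation for predictable
# vs. unpredictable sentences. Traces, sampling rate, and window are invented.
import numpy as np

fs = 60                                      # assumed eye-tracker sampling rate (Hz)
t = np.arange(-1.0, 4.0, 1 / fs)             # 1 s baseline plus 4 s of sentence/retention

def baseline_correct(trace, t, baseline_end=0.0):
    return trace - trace[t < baseline_end].mean()

def mean_dilation(trace, t, start=1.0, stop=3.0):
    """Average dilation (arbitrary units) in a post-onset analysis window."""
    return trace[(t >= start) & (t <= stop)].mean()

rng = np.random.default_rng(3)
# Fake traces: the unpredictable condition peaks higher and resolves later.
predictable = 0.15 * np.exp(-((t - 1.2) ** 2) / 0.8) + rng.normal(0, 0.01, t.size)
unpredictable = 0.25 * np.exp(-((t - 1.8) ** 2) / 1.2) + rng.normal(0, 0.01, t.size)

effort_reduction = (mean_dilation(baseline_correct(unpredictable, t), t)
                    - mean_dilation(baseline_correct(predictable, t), t))
print(f"context-related reduction in pupil dilation: {effort_reduction:.3f} a.u.")
```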
Collapse
Affiliation(s)
- Matthew B. Winn
- Department of Speech & Hearing Sciences, University of Washington, Seattle, WA, USA
| |
Collapse
|
48
|
Word Recognition Variability With Cochlear Implants: "Perceptual Attention" Versus "Auditory Sensitivity". Ear Hear 2016; 37:14-26. [PMID: 26301844 DOI: 10.1097/aud.0000000000000204] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
OBJECTIVES Cochlear implantation does not automatically result in robust spoken language understanding for postlingually deafened adults. Enormous outcome variability exists, related to the complexity of understanding spoken language through cochlear implants (CIs), which deliver degraded speech representations. This investigation examined variability in word recognition as explained by "perceptual attention" and "auditory sensitivity" to acoustic cues underlying speech perception. DESIGN Thirty postlingually deafened adults with CIs and 20 age-matched controls with normal hearing (NH) were tested. Participants underwent assessment of word recognition in quiet and perceptual attention (cue-weighting strategies) based on labeling tasks for two phonemic contrasts: (1) "cop"-"cob," based on a duration cue (easily accessible through CIs) or a dynamic spectral cue (less accessible through CIs), and (2) "sa"-"sha," based on static or dynamic spectral cues (both potentially poorly accessible through CIs). Participants were also assessed for auditory sensitivity to the speech cues underlying those labeling decisions. RESULTS Word recognition varied widely among CI users (20 to 96%), but it was generally poorer than for NH participants. Implant users and NH controls showed similar perceptual attention and auditory sensitivity to the duration cue, while CI users showed poorer attention and sensitivity to all spectral cues. Both attention and sensitivity to spectral cues predicted variability in word recognition. CONCLUSIONS For CI users, both perceptual attention and auditory sensitivity are important in word recognition. Efforts should be made to better represent spectral cues through implants, while also facilitating attention to these cues through auditory training.
Collapse
|
49
|
Neural Correlates of Phonetic Learning in Postlingually Deafened Cochlear Implant Listeners. Ear Hear 2016; 37:514-28. [DOI: 10.1097/aud.0000000000000287] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
50
|
Kong YY, Winn MB, Poellmann K, Donaldson GS. Discriminability and Perceptual Saliency of Temporal and Spectral Cues for Final Fricative Consonant Voicing in Simulated Cochlear-Implant and Bimodal Hearing. Trends Hear 2016; 20:2331216516652145. [PMID: 27317666 PMCID: PMC5562340 DOI: 10.1177/2331216516652145] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
Abstract
Multiple redundant acoustic cues can contribute to the perception of a single phonemic contrast. This study investigated the effect of spectral degradation on the discriminability and perceptual saliency of acoustic cues for identification of word-final fricative voicing in "loss" versus "laws", and possible changes that occurred when low-frequency acoustic cues were restored. Three acoustic cues that contribute to the word-final /s/-/z/ contrast (first formant frequency [F1] offset, vowel-consonant duration ratio, and consonant voicing duration) were systematically varied in synthesized words. A discrimination task measured listeners' ability to discriminate differences among stimuli within a single cue dimension. A categorization task examined the extent to which listeners make use of a given cue to label a syllable as "loss" versus "laws" when multiple cues are available. Normal-hearing listeners were presented with stimuli that were either unprocessed, processed with an eight-channel noise-band vocoder to approximate spectral degradation in cochlear implants, or low-pass filtered. Listeners were tested in four listening conditions: unprocessed, vocoder, low-pass, and a combined vocoder + low-pass condition that simulated bimodal hearing. Results showed a negative impact of spectral degradation on F1 cue discrimination and a trading relation between spectral and temporal cues in which listeners relied more heavily on the temporal cues for "loss-laws" identification when spectral cues were degraded. Furthermore, the addition of low-frequency fine-structure cues in simulated bimodal hearing increased the perceptual saliency of the F1 cue for "loss-laws" identification compared with vocoded speech. Findings suggest an interplay between the quality of sensory input and cue importance.
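The simulated bimodal condition combines vocoded speech (the simulated CI ear) with low-pass-filtered speech that restores low-frequency fine structure; the sketch below assumes hypothetical file names, a 500-Hz cutoff, and simple summation of the two signals, none of which are specified in the abstract.

```python
# Sketch: simulated bimodal hearing = low-pass-filtered speech (acoustic ear)
# combined with vocoded speech (simulated CI ear). File names, the 500-Hz
# cutoff, and the simple summation are illustrative assumptions.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfiltfilt

fs, speech = wavfile.read("laws_token.wav")              # hypothetical unprocessed token
_, vocoded = wavfile.read("laws_token_vocoded.wav")      # hypothetical 8-channel vocoded token

sos = butter(4, 500, btype="lowpass", fs=fs, output="sos")
low_pass = sosfiltfilt(sos, speech.astype(float))        # preserves low-frequency fine structure

n = min(len(low_pass), len(vocoded))
bimodal = low_pass[:n] + vocoded[:n].astype(float)       # combined vocoder + low-pass condition
```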
Collapse
Affiliation(s)
- Ying-Yee Kong
- Department of Communication Sciences and Disorders, Northeastern University, Boston, MA, USA
| | - Matthew B Winn
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, USA
| | - Katja Poellmann
- Department of Communication Sciences and Disorders, Northeastern University, Boston, MA, USA
| | - Gail S Donaldson
- Department of Communication Sciences & Disorders, University of South Florida, Tampa, FL, USA
| |
Collapse
|