1
Yang J, Sidhu J, Totino G, McKim S, Xu L. Accent rating of vocoded foreign-accented speech by native listeners. JASA Express Letters 2023;3:095204. PMID: 37747319. DOI: 10.1121/10.0020989.
Abstract
This study examined accent rating of speech samples collected from 12 Mandarin-accented English talkers and two native English talkers. The speech samples were processed with noise- and tone-vocoders at 1, 2, 4, 8, and 16 channels. The accentedness of the vocoded and unprocessed signals was judged by 53 native English listeners on a 9-point scale. The foreign-accented talkers were judged as having a less strong accent in the vocoded conditions than in the unprocessed condition. The native talkers and foreign-accented talkers with varying degrees of accentedness demonstrated different patterns of accent rating changes as a function of the number of channels.
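The noise- and tone-vocoder processing described above follows the standard channel-vocoder scheme: split the signal into analysis bands, extract each band's temporal envelope, and use that envelope to modulate a carrier (band-limited noise, or a tone at the band's center frequency). A minimal sketch, assuming NumPy/SciPy; the band edges, filter orders, and 50 Hz envelope cutoff are illustrative choices, not the study's exact parameters:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def vocode(x, fs, n_channels=8, carrier="noise", f_lo=100.0, f_hi=8000.0):
    """Channel-vocode x: split into bands, extract each band's envelope,
    and use it to modulate a noise or tone carrier."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)   # log-spaced band edges
    rng = np.random.default_rng(0)
    t = np.arange(len(x)) / fs
    out = np.zeros(len(x))
    env_lp = butter(2, 50.0 / (fs / 2), output="sos")  # 50 Hz envelope smoother
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo / (fs / 2), hi / (fs / 2)],
                          btype="band", output="sos")
        band = sosfiltfilt(band_sos, x)
        env = sosfiltfilt(env_lp, np.abs(hilbert(band)))  # temporal envelope
        if carrier == "noise":   # band-limited noise carrier
            c = sosfiltfilt(band_sos, rng.standard_normal(len(x)))
        else:                    # tone carrier at the band's geometric centre
            c = np.sin(2 * np.pi * np.sqrt(lo * hi) * t)
        out += env * c
    # Restore the input's overall RMS level.
    return out * np.sqrt(np.mean(x**2)) / (np.sqrt(np.mean(out**2)) + 1e-12)
```

Sweeping `n_channels` over 1, 2, 4, 8, and 16 with the two carrier types reproduces the kind of spectral-degradation continuum used in this study.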
Affiliation(s)
- Jing Yang
- Communication Sciences and Disorders, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin 53201, USA
- Jaskirat Sidhu
- Communication Sciences and Disorders, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin 53201, USA
- Gabrielle Totino
- Hearing, Speech and Language Sciences, Ohio University, Athens, Ohio 45701
- Sarah McKim
- Hearing, Speech and Language Sciences, Ohio University, Athens, Ohio 45701
- Li Xu
- Hearing, Speech and Language Sciences, Ohio University, Athens, Ohio 45701
2
Roverud E, Villard S, Kidd G. Strength of target source segregation cues affects the outcome of speech-on-speech masking experiments. The Journal of the Acoustical Society of America 2023;153:2780. PMID: 37140176. PMCID: PMC10319449. DOI: 10.1121/10.0019307.
Abstract
In speech-on-speech listening experiments, some means for designating which talker is the "target" must be provided for the listener to perform better than chance. However, the relative strength of the segregation variables designating the target could affect the results of the experiment. Here, we examine the interaction of two source segregation variables (spatial separation and talker gender differences) and demonstrate that the relative strengths of these cues may affect the interpretation of the results. Participants listened to sentence pairs spoken by different-gender target and masker talkers, presented naturally or vocoded (degrading gender cues), either colocated or spatially separated. Target and masker words were temporally interleaved to eliminate energetic masking in either an every-other-word or randomized order of presentation. Results showed that the order of interleaving had no effect on recall performance. For natural speech with strong talker gender cues, spatial separation of sources yielded no improvement in performance. For vocoded speech with degraded talker gender cues, performance improved significantly with spatial separation of sources. These findings reveal that listeners may shift among target source segregation cues contingent on cue viability. Finally, performance was poor when the target was designated after stimulus presentation, indicating strong reliance on the cues.
Affiliation(s)
- Elin Roverud
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Sarah Villard
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Gerald Kidd
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
3
Li MM, Moberly AC, Tamati TN. Factors affecting talker discrimination ability in adult cochlear implant users. Journal of Communication Disorders 2022;99:106255. PMID: 35988314. PMCID: PMC10659049. DOI: 10.1016/j.jcomdis.2022.106255.
Abstract
INTRODUCTION Real-world speech communication involves interacting with many talkers with diverse voices and accents. Many adults with cochlear implants (CIs) demonstrate poor talker discrimination, which may contribute to real-world communication difficulties. However, the factors contributing to talker discrimination ability, and how discrimination ability relates to speech recognition outcomes in adult CI users are still unknown. The current study investigated talker discrimination ability in adult CI users, and the contributions of age, auditory sensitivity, and neurocognitive skills. In addition, the relation between talker discrimination ability and multiple-talker sentence recognition was explored. METHODS Fourteen post-lingually deaf adult CI users (3 female, 11 male) with ≥1 year of CI use completed a talker discrimination task. Participants listened to two monosyllabic English words, produced by the same talker or by two different talkers, and indicated if the words were produced by the same or different talkers. Nine female and nine male native English talkers were paired, resulting in same- and different-talker pairs as well as same-gender and mixed-gender pairs. Participants also completed measures of spectro-temporal processing, neurocognitive skills, and multiple-talker sentence recognition. RESULTS CI users showed poor same-gender talker discrimination, but relatively good mixed-gender talker discrimination. Older age and weaker neurocognitive skills, in particular inhibitory control, were associated with less accurate mixed-gender talker discrimination. Same-gender discrimination was significantly related to multiple-talker sentence recognition accuracy. CONCLUSION Adult CI users demonstrate overall poor talker discrimination ability. 
Individual differences in mixed-gender discrimination ability were related to age and neurocognitive skills, suggesting that these factors contribute to the ability to make use of available, degraded talker characteristics. Same-gender talker discrimination was associated with multiple-talker sentence recognition, suggesting that access to subtle talker-specific cues may be important for speech recognition in challenging listening conditions.
Affiliation(s)
- Michael M Li
- The Ohio State University Wexner Medical Center, Department of Otolaryngology - Head & Neck Surgery, Columbus, OH, USA
- Aaron C Moberly
- The Ohio State University Wexner Medical Center, Department of Otolaryngology - Head & Neck Surgery, Columbus, OH, USA
- Terrin N Tamati
- The Ohio State University Wexner Medical Center, Department of Otolaryngology - Head & Neck Surgery, Columbus, OH, USA; University Medical Center Groningen, University of Groningen, Department of Otorhinolaryngology/Head and Neck Surgery, Groningen, the Netherlands.
4
Age-Related Changes in Voice Emotion Recognition by Postlingually Deafened Listeners With Cochlear Implants. Ear Hear 2022;43:323-334. PMID: 34406157. PMCID: PMC8847542. DOI: 10.1097/aud.0000000000001095.
Abstract
OBJECTIVES Identification of emotional prosody in speech declines with age in normally hearing (NH) adults. Cochlear implant (CI) users have deficits in the perception of prosody, but the effects of age on vocal emotion recognition by adult postlingually deaf CI users are not known. The objective of the present study was to examine age-related changes in CI users' and NH listeners' emotion recognition. DESIGN Participants included 18 CI users (29.6 to 74.5 years) and 43 NH adults (25.8 to 74.8 years). Participants listened to emotion-neutral sentences spoken by a male and female talker in five emotions (happy, sad, scared, angry, neutral). NH adults heard them in four conditions: unprocessed (full spectrum) speech, 16-channel, 8-channel, and 4-channel noise-band vocoded speech. The adult CI users only listened to unprocessed (full spectrum) speech. Sensitivity (d') to emotions and Reaction Times were obtained using a single-interval, five-alternative, forced-choice paradigm. RESULTS For NH participants, results indicated age-related declines in Accuracy and d', and age-related increases in Reaction Time in all conditions. Results indicated an overall deficit, as well as age-related declines in overall d' for CI users, but Reaction Times were elevated compared with NH listeners and did not show age-related changes. Analysis of Accuracy scores (hit rates) were generally consistent with d' data. CONCLUSIONS Both CI users and NH listeners showed age-related deficits in emotion identification. The CI users' overall deficit in emotion perception, and their slower response times, suggest impaired social communication which may in turn impact overall well-being, particularly so for older CI users, as lower vocal emotion recognition scores have been associated with poorer subjective quality of life in CI patients.
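Sensitivity (d') in paradigms like the one above is derived from hit and false-alarm rates via the inverse normal CDF. The sketch below is the standard yes/no formula with a common log-linear correction for extreme rates; it is illustrative only and not necessarily the exact multi-alternative model the authors fit:

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false-alarm rate), with 0.5 added to each
    cell (log-linear correction) so rates never reach exactly 0 or 1."""
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse standard normal CDF (stdlib)
    return z(hit_rate) - z(fa_rate)
```

For example, 90 hits out of 100 signal trials against 10 false alarms out of 100 noise trials yields a d' of about 2.5, while equal hit and false-alarm rates yield a d' of 0.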
5
Shader MJ, Kwon BJ, Gordon-Salant S, Goupell MJ. Open-Set Phoneme Recognition Performance With Varied Temporal Cues in Younger and Older Cochlear Implant Users. Journal of Speech, Language, and Hearing Research 2022;65:1196-1211. PMID: 35133853. PMCID: PMC9150732. DOI: 10.1044/2021_jslhr-21-00299.
Abstract
PURPOSE The goal of this study was to investigate the effect of age on phoneme recognition performance in which the stimuli varied in the amount of temporal information available in the signal. Chronological age is increasingly recognized as a factor that can limit the amount of benefit an individual can receive from a cochlear implant (CI). Central auditory temporal processing deficits in older listeners may contribute to the performance gap between younger and older CI users on recognition of phonemes varying in temporal cues. METHOD Phoneme recognition was measured at three stimulation rates (500, 900, and 1800 pulses per second) and two envelope modulation frequencies (50 Hz and unfiltered) in 20 CI participants ranging in age from 27 to 85 years. Speech stimuli were multiple word pairs differing in temporal contrasts and were presented via direct stimulation of the electrode array using an eight-channel continuous interleaved sampling strategy. Phoneme recognition performance was evaluated at each stimulation rate condition using both envelope modulation frequencies. RESULTS Duration of deafness was the strongest subject-level predictor of phoneme recognition, with participants with longer durations of deafness having poorer performance overall. Chronological age did not predict performance for any stimulus condition. Additionally, duration of deafness interacted with envelope filtering. Participants with shorter durations of deafness were able to take advantage of higher frequency envelope modulations, while participants with longer durations of deafness were not. CONCLUSIONS Age did not significantly predict phoneme recognition performance. In contrast, longer durations of deafness were associated with a reduced ability to utilize available temporal information within the signal to improve phoneme recognition performance.
Affiliation(s)
- Maureen J. Shader
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN
- Matthew J. Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park
6
Villard S, Kidd G. Speech intelligibility and talker gender classification with noise-vocoded and tone-vocoded speech. JASA Express Letters 2021;1:094401. PMID: 34590078. PMCID: PMC8456348. DOI: 10.1121/10.0006285.
Abstract
Vocoded speech provides less spectral information than natural, unprocessed speech, negatively affecting listener performance on speech intelligibility and talker gender classification tasks. In this study, young normal-hearing participants listened to noise-vocoded and tone-vocoded (i.e., sinewave-vocoded) sentences containing 1, 2, 4, 8, 16, or 32 channels, as well as non-vocoded sentences, and reported the words heard as well as the gender of the talker. Overall, performance was significantly better with tone-vocoded than noise-vocoded speech for both tasks. Within the talker gender classification task, biases in performance were observed for lower numbers of channels, especially when using the noise carrier.
Affiliation(s)
- Sarah Villard
- Department of Speech, Language and Hearing Sciences & Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215
- Gerald Kidd
- Department of Speech, Language and Hearing Sciences & Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215
7
Shen J. Older Listeners' Perception of Speech With Strengthened and Weakened Dynamic Pitch Cues in Background Noise. Journal of Speech, Language, and Hearing Research 2021;64:348-358. PMID: 33439741. PMCID: PMC8632513. DOI: 10.1044/2020_jslhr-20-00116.
Abstract
Purpose Dynamic pitch, which is defined as the variation in fundamental frequency, is an acoustic cue that aids speech perception in noise. This study examined the effects of strengthened and weakened dynamic pitch cues on older listeners' speech perception in noise, as well as how these effects were modulated by individual factors including spectral perception ability. Method The experiment measured speech reception thresholds in noise in both younger listeners with normal hearing and older listeners whose hearing status ranged from near-normal hearing to mild-to-moderate sensorineural hearing loss. The pitch contours of the target speech were manipulated to create four levels of dynamic pitch strength: weakened, original, mildly strengthened, and strengthened. Listeners' spectral perception ability was measured using tests of spectral ripple and frequency modulation discrimination. Results Both younger and older listeners performed worse with manipulated dynamic pitch cues than with original dynamic pitch. The effects of dynamic pitch on older listeners' speech recognition were associated with their age but not with their perception of spectral information. Those older listeners who were relatively younger were more negatively affected by dynamic pitch manipulations. Conclusions The findings suggest the current pitch manipulation strategy is detrimental for older listeners to perceive speech in noise, as compared to original dynamic pitch. While the influence of age on the effects of dynamic pitch is likely due to age-related declines in pitch perception, the spectral measures used in this study were not strong predictors for dynamic pitch effects. Taken together, these results indicate next steps in this line of work should be focused on how to manipulate acoustic cues in speech in order to improve speech perception in noise for older listeners.
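One simple way to think about the strengthened/weakened manipulation is as scaling each voiced frame's F0 deviation from the utterance mean. The abstract does not describe the study's actual resynthesis pipeline, so the function below is purely illustrative of the idea:

```python
def scale_dynamic_pitch(f0_contour, factor):
    """Scale F0 excursions around the utterance mean: factor < 1 flattens
    (weakens) the contour, factor > 1 exaggerates (strengthens) it.
    Unvoiced frames, conventionally coded as 0, are left untouched."""
    voiced = [f for f in f0_contour if f > 0]
    mean_f0 = sum(voiced) / len(voiced)
    return [mean_f0 + factor * (f - mean_f0) if f > 0 else 0.0
            for f in f0_contour]
```

A factor of 0 collapses the contour to a monotone at the mean F0; a factor of 2 doubles every excursion, corresponding to the "strengthened" end of the continuum.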
Affiliation(s)
- Jing Shen
- Department of Speech, Language and Hearing Sciences, Western Michigan University, Kalamazoo
8
Xie Z, Gaskins CR, Shader MJ, Gordon-Salant S, Anderson S, Goupell MJ. Age-Related Temporal Processing Deficits in Word Segments in Adult Cochlear-Implant Users. Trends Hear 2019;23:2331216519886688. PMID: 31808373. PMCID: PMC6900735. DOI: 10.1177/2331216519886688.
Abstract
Aging may limit speech understanding outcomes in cochlear-implant (CI) users. Here, we examined age-related declines in auditory temporal processing as a potential mechanism that underlies speech understanding deficits associated with aging in CI users. Auditory temporal processing was assessed with a categorization task for the words "dish" and "ditch" (i.e., identify each token as the word "dish" or "ditch") on a continuum of speech tokens with varying silence duration (0 to 60 ms) prior to the final fricative. In Experiments 1 and 2, younger CI (YCI), middle-aged CI (MCI), and older CI (OCI) users participated in the categorization task across a range of presentation levels (25 to 85 dB). Relative to YCI, OCI required longer silence durations to identify "ditch" and exhibited reduced ability to distinguish the words "dish" and "ditch" (shallower slopes in the categorization function). Critically, we observed age-related performance differences only at higher presentation levels. This contrasted with findings from normal-hearing listeners in Experiment 3 that demonstrated age-related performance differences independent of presentation level. In summary, aging in CI users appears to degrade the ability to utilize brief temporal cues in word identification, particularly at high levels. Age-specific CI programming may potentially improve clinical outcomes for speech understanding performance by older CI listeners.
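The categorization function referred to above is typically modeled as a sigmoid over the silence-duration continuum, with its slope indexing how sharply the listener separates the two words. A minimal sketch of this idea; the logistic form and parameter values are illustrative, not fit to the study's data:

```python
import math

def p_ditch(silence_ms, boundary_ms=30.0, slope=0.2):
    """Logistic categorization function: probability of reporting "ditch"
    as a function of pre-fricative silence duration. boundary_ms is the
    50% category boundary; a smaller slope gives a shallower function,
    i.e., a poorer dish/ditch distinction."""
    return 1.0 / (1.0 + math.exp(-slope * (silence_ms - boundary_ms)))
```

With the illustrative parameters, a 0 ms token is almost always heard as "dish" and a 60 ms token as "ditch"; reducing the slope pulls both endpoints toward chance, which is the pattern described for the older CI users.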
Affiliation(s)
- Zilong Xie
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, USA
- Casey R Gaskins
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, USA
- Maureen J Shader
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, USA
- Sandra Gordon-Salant
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, USA
- Samira Anderson
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, USA
- Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, USA
9
Luo X, Kolberg C, Pulling KR, Azuma T. Psychoacoustic and Demographic Factors for Speech Recognition of Older Adult Cochlear Implant Users. Journal of Speech, Language, and Hearing Research 2020;63:1712-1725. PMID: 32501736. DOI: 10.1044/2020_jslhr-19-00225.
Abstract
Purpose This study aimed to evaluate the effects of aging and cochlear implant (CI) on psychoacoustic and speech recognition abilities and to assess the relative contributions of psychoacoustic and demographic factors to speech recognition of older CI (OCI) users. Method Twelve OCI users, 12 older acoustic-hearing (OAH) listeners age-matched to OCI users, and 12 younger normal-hearing (YNH) listeners underwent tests of temporal amplitude modulation detection, temporal gap detection in noise, and spectral-temporal modulated ripple discrimination. Speech reception thresholds were measured for sentence recognition in multitalker, speech-babble noise. Results Statistical analyses showed that, for the small sample of OAH listeners, the degree of hearing loss did not significantly affect any outcome measure. Temporal resolution, spectral resolution, and speech recognition all significantly degraded with both age and the use of a CI (i.e., YNH better than OAH and OAH better than OCI performance). Although both were significantly correlated with OCI users' speech recognition, the duration of CI use no longer had a significant effect on speech recognition once the effect of spectral-temporal ripple discrimination performance was taken into account. For OAH listeners, the only significant predictor of speech recognition was temporal gap detection performance. Conclusion The preliminary results suggest that speech recognition of OCI users may improve with longer duration of CI use, mainly due to higher perceptual acuity to spectral-temporal modulated ripples in acoustic stimuli.
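The abstract does not specify how the speech reception thresholds (SRTs) were tracked; a common approach is an adaptive staircase that converges on a target percent-correct SNR. A minimal 1-down/1-up sketch, which tracks the 50% point; the step size, trial count, and averaging rule are illustrative assumptions:

```python
def run_staircase(respond, start_snr=10.0, step_db=2.0, n_trials=30):
    """Simple 1-down/1-up adaptive track converging on the SNR for 50%
    correct (one common way to estimate a speech reception threshold).
    respond(snr) returns True for a correct response at that SNR.
    The SRT estimate is the mean SNR over the last half of the trials."""
    snr = start_snr
    history = []
    for _ in range(n_trials):
        history.append(snr)
        snr += -step_db if respond(snr) else step_db  # down on correct, up on error
    tail = history[n_trials // 2:]
    return sum(tail) / len(tail)
```

With a simulated listener who is correct whenever the SNR is at or above 0 dB, the track descends from the start level and then oscillates around the 0 dB boundary, so the estimate lands near it.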
Affiliation(s)
- Xin Luo
- College of Health Solutions, Arizona State University, Tempe
- Tamiko Azuma
- College of Health Solutions, Arizona State University, Tempe
10
Spectral-Temporal Trade-Off in Vocoded Sentence Recognition: Effects of Age, Hearing Thresholds, and Working Memory. Ear Hear 2020;41:1226-1235. PMID: 32032222. DOI: 10.1097/aud.0000000000000840.
Abstract
OBJECTIVES Cochlear implant (CI) signal processing degrades the spectral components of speech. This requires CI users to rely primarily on temporal cues, specifically, amplitude modulations within the temporal envelope, to recognize speech. Auditory temporal processing ability for envelope modulations worsens with advancing age, which may put older CI users at a disadvantage compared with younger users. To evaluate how potential age-related limitations for processing temporal envelope modulations impact spectrally degraded sentence recognition, noise-vocoded sentences were presented to younger and older normal-hearing listeners in quiet. Envelope modulation rates were varied from 10 to 500 Hz by adjusting the low-pass filter cutoff frequency (LPF). The goal of this study was to evaluate if age impacts recognition of noise-vocoded speech and if this age-related limitation existed for a specific range of envelope modulation rates. DESIGN Noise-vocoded sentence recognition in quiet was measured as a function of number of spectral channels (4, 6, 8, and 12 channels) and LPF (10, 20, 50, 75, 150, 375, and 500 Hz) in 15 younger normal-hearing listeners and 15 older near-normal-hearing listeners. Hearing thresholds and working memory were assessed to determine the extent to which these factors were related to recognition of noise-vocoded sentences. RESULTS Younger listeners achieved significantly higher sentence recognition scores than older listeners overall. Performance improved in both groups as the number of spectral channels and LPF increased. As the number of spectral channels increased, the differences in sentence recognition scores between groups decreased. A spectral-temporal trade-off was observed in both groups in which performance in the 8- and 12-channel conditions plateaued with lower-frequency amplitude modulations compared with the 4- and 6-channel conditions. 
There was no interaction between age group and LPF, suggesting that both groups obtained similar improvements in performance with increasing LPF. The lack of an interaction between age and LPF may be due to the nature of the task of recognizing sentences in quiet. Audiometric thresholds were the only significant predictor of vocoded sentence recognition. Although performance on the working memory task declined with advancing age, working memory scores did not predict sentence recognition. CONCLUSIONS Younger listeners outperformed older listeners for recognizing noise-vocoded sentences in quiet. The negative impact of age was reduced when ample spectral information was available. Age-related limitations for recognizing vocoded sentences were not affected by the temporal envelope modulation rate of the signal, but instead, appear to be related to a generalized task limitation or to reduced audibility of the signal.
11
Bologna WJ, Ahlstrom JB, Dubno JR. Contributions of Voice Expectations to Talker Selection in Younger and Older Adults With Normal Hearing. Trends Hear 2020;24:2331216520915110. PMID: 32372720. PMCID: PMC7225833. DOI: 10.1177/2331216520915110.
Abstract
Focused attention on expected voice features, such as fundamental frequency (F0) and spectral envelope, may facilitate segregation and selection of a target talker in competing talker backgrounds. Age-related declines in attention may limit these abilities in older adults, resulting in poorer speech understanding in complex environments. To test this hypothesis, younger and older adults with normal hearing listened to sentences with a single competing talker. For most trials, listener attention was directed to the target by a cue phrase that matched the target talker's F0 and spectral envelope. For a small percentage of randomly occurring probe trials, the target's voice unexpectedly differed from the cue phrase in terms of F0 and spectral envelope. Overall, keyword recognition for the target talker was poorer for older adults than younger adults. Keyword recognition was poorer on probe trials than standard trials for both groups, and incorrect responses on probe trials contained keywords from the single-talker masker. No interaction was observed between age-group and the decline in keyword recognition on probe trials. Thus, reduced performance by older adults overall could not be attributed to declines in attention to an expected voice. Rather, other cognitive abilities, such as speed of processing and linguistic closure, were predictive of keyword recognition for younger and older adults. Moreover, the effects of age interacted with the sex of the target talker, such that older adults had greater difficulty understanding target keywords from female talkers than male talkers.
Affiliation(s)
- William J. Bologna
- Department of Otolaryngology—Head and Neck Surgery, Medical University of South Carolina
- Jayne B. Ahlstrom
- Department of Otolaryngology—Head and Neck Surgery, Medical University of South Carolina
- Judy R. Dubno
- Department of Otolaryngology—Head and Neck Surgery, Medical University of South Carolina
12
The Sound of a Cochlear Implant Investigated in Patients With Single-Sided Deafness and a Cochlear Implant. Otol Neurotol 2019;39:707-714. PMID: 29889780. DOI: 10.1097/mao.0000000000001821.
Abstract
HYPOTHESIS A cochlear implant (CI) restores hearing in patients with profound sensorineural hearing loss by electrical stimulation of the auditory nerve. It is unknown how this electrical stimulation sounds. BACKGROUND Patients with single-sided deafness (SSD) and a CI form a unique population, since they can compare the sound of their CI with simulations of the CI sound played to their nonimplanted ear. METHODS We tested six stimuli (speech and music) in 10 SSD patients implanted with a CI (Cochlear Ltd). Patients listened to the original stimulus with their CI ear while their nonimplanted ear was masked. Subsequently, patients listened to two CI simulations, created with a vocoder, with their nonimplanted ear alone. They selected the CI simulation with greatest similarity to the sound as perceived by their CI ear and they graded similarity on a 1 to 10 scale. We tested three vocoders: two known from the literature, and one supplied by Cochlear Ltd. Two carriers (noise, sine) were tested for each vocoder. RESULTS Carrier noise and the vocoders from the literature were most often selected as best match to the sound as perceived by the CI ear. However, variability in selections was substantial both between patients and within patients between sound samples. The average grade for similarity was 6.8 for speech stimuli and 6.3 for music stimuli. CONCLUSION We obtained a fairly good impression of what a CI can sound like for SSD patients. This may help to better inform and educate patients and family members about the sound of a CI.
13
Reducing Simulated Channel Interaction Reveals Differences in Phoneme Identification Between Children and Adults With Normal Hearing. Ear Hear 2019;40:295-311. PMID: 29927780. DOI: 10.1097/aud.0000000000000615.
Abstract
OBJECTIVES Channel interaction, the stimulation of overlapping populations of auditory neurons by distinct cochlear implant (CI) channels, likely limits the speech perception performance of CI users. This study examined the role of vocoder-simulated channel interaction in the ability of children with normal hearing (cNH) and adults with normal hearing (aNH) to recognize spectrally degraded speech. The primary aim was to determine the interaction between number of processing channels and degree of simulated channel interaction on phoneme identification performance as a function of age for cNH and to relate those findings to aNH and to CI users. DESIGN Medial vowel and consonant identification of cNH (age 8-17 years) and young aNH were assessed under six (for children) or nine (for adults) different conditions of spectral degradation. Stimuli were processed using a noise-band vocoder with 8, 12, and 15 channels and synthesis filter slopes of 15 (aNH only), 30, and 60 dB/octave (all NH subjects). Steeper filter slopes (larger numbers) simulated less electrical current spread and, therefore, less channel interaction. Spectrally degraded performance of the NH listeners was also compared with the unprocessed phoneme identification of school-aged children and adults with CIs. RESULTS Spectrally degraded phoneme identification improved as a function of age for cNH. For vowel recognition, cNH exhibited an interaction between the number of processing channels and vocoder filter slope, whereas aNH did not. Specifically, for cNH, increasing the number of processing channels only improved vowel identification in the steepest filter slope condition. Additionally, cNH were more sensitive to changes in filter slope. As the filter slopes increased, cNH continued to receive vowel identification benefit beyond where aNH performance plateaued or reached ceiling. 
For all NH participants, consonant identification improved with increasing filter slopes but was unaffected by the number of processing channels. Although cNH made more phoneme identification errors overall, their phoneme error patterns were similar to aNH. Furthermore, consonant identification of adults with CI was comparable to aNH listening to simulations with shallow filter slopes (15 dB/octave). Vowel identification of earlier-implanted pediatric ears was better than that of later-implanted ears and more comparable to cNH listening in conditions with steep filter slopes (60 dB/octave). CONCLUSIONS Recognition of spectrally degraded phonemes improved when simulated channel interaction was reduced, particularly for children. cNH showed an interaction between number of processing channels and filter slope for vowel identification. The differences observed between cNH and aNH suggest that identification of spectrally degraded phonemes continues to improve through adolescence and that children may benefit from reduced channel interaction beyond where adult performance has plateaued. Comparison to CI users suggests that early implantation may facilitate development of better phoneme discrimination.
14.
Casserly ED, Krizmanich T, Drews H. The Viability of Media Interviews as Materials for Auditory Training. Am J Audiol 2019; 28:376-383. [PMID: 31084572 DOI: 10.1044/2019_aja-18-0182]
Abstract
Purpose Rehabilitative auditory training for people with hearing loss faces 2 primary challenges: generalization of learning to novel contexts and user adherence to training goals. We hypothesized that using interview excerpts from popular media as training materials would have the potential to positively influence both of these areas. Interviews contain predictable, structured complexity that promotes perceptual generalization and are also designed to be engaging for consumers. This study tested the viability of such popular media interviews as training materials, comparing their effectiveness to that obtained with sentence transcription training. Method Young adults with normal hearing (N = 60) completed 1 hr of transcription training using noise-vocoded materials, simulating acoustic perception through an 8-channel cochlear implant. Participants completed pre- and posttraining assessments of vocoded speech perception in quiet and in noise, along with posttraining high-variability sentence recognition and cued isolated word recognition. Scores on all tests were compared across 4 randomly assigned groups differing in training materials: audiovisual interviews, audio-only interviews, isolated sentences, and undegraded isolated sentences (providing an untrained control comparison group). Results Recognition in quiet and in noise improved with both types of interview-based training, and interview training groups outperformed the control group on all generalization tests. Participants in the audiovisual interview group also reported significantly higher, more sustained engagement in a retrospective survey. Conclusions Media interviews appear to be at least as effective as isolated sentences for transcription-based auditory training in simulated hearing loss settings with young adults and may improve engagement and generalization of benefit in auditory training applications.
Affiliation(s)
- Hunter Drews
- Department of Psychology, Trinity College, Hartford, CT
15.
Age-Related Differences in the Processing of Temporal Envelope and Spectral Cues in a Speech Segment. Ear Hear 2018; 38:e335-e342. [PMID: 28562426 DOI: 10.1097/aud.0000000000000447]
Abstract
OBJECTIVES As people age, they experience reduced temporal processing abilities. This results in poorer ability to understand speech, particularly for degraded input signals. Cochlear implants (CIs) convey speech information via the temporal envelopes of a spectrally degraded input signal. Because there is an increasing number of older CI users, there is a need to understand how temporal processing changes with age. Therefore, the goal of this study was to quantify age-related reduction in temporal processing abilities when attempting to discriminate words based on temporal envelope information from spectrally degraded signals. DESIGN Younger normal-hearing (YNH) and older normal-hearing (ONH) participants were presented a continuum of speech tokens that varied in silence duration between phonemes (0 to 60 ms in 10-ms steps), and were asked to identify whether the stimulus was perceived more as the word "dish" or "ditch." Stimuli were vocoded using tonal carriers. The number of channels (1, 2, 4, 8, 16, and unprocessed) and temporal envelope low-pass filter cutoff frequency (50 and 400 Hz) were systematically varied. RESULTS For the unprocessed conditions, the YNH participants perceived the word ditch for smaller silence durations than the ONH participants, indicating that aging affects temporal processing abilities. There was no difference in performance between the unprocessed and 16-channel, 400-Hz vocoded stimuli. Decreasing the number of spectral channels caused decreased ability to distinguish dish and ditch. Decreasing the envelope cutoff frequency also caused decreased ability to distinguish dish and ditch. The overall pattern of results revealed that reductions in spectral and temporal information had a relatively larger effect on the ONH participants compared with the YNH participants. CONCLUSIONS Aging reduces the ability to utilize brief temporal cues in speech segments. 
Reducing spectral information, as occurs in a channel vocoder and in CI speech processing strategies, forces participants to use temporal envelope information; however, older participants are less capable of utilizing this information. These results suggest that providing as much spectral and temporal speech information as possible would benefit older CI users relatively more than younger CI users. In addition, the present findings help set expectations of clinical outcomes for speech understanding performance by adult CI users as a function of age.
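The two vocoder parameters manipulated in this study (tonal carriers and an envelope low-pass cutoff of 50 vs 400 Hz) can be illustrated for a single channel. This is a rough sketch with assumed filter choices, not the authors' implementation:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def tone_vocode_channel(band, fs, carrier_hz, env_cutoff_hz):
    """One tone-vocoder channel: extract the band's Hilbert envelope,
    smooth it with a low-pass filter at env_cutoff_hz (e.g., 50 vs 400 Hz),
    and impose it on a sinusoidal carrier. A 50-Hz cutoff strips the
    fast periodicity cues that a 400-Hz cutoff preserves."""
    env = np.abs(hilbert(band))
    sos = butter(4, env_cutoff_hz, btype="low", fs=fs, output="sos")
    env = np.maximum(sosfiltfilt(sos, env), 0.0)  # keep envelope nonnegative
    t = np.arange(len(band)) / fs
    return env * np.sin(2 * np.pi * carrier_hz * t)
```

With a 100-Hz amplitude modulation on the input band, the modulation survives a 400-Hz envelope cutoff but is largely removed by a 50-Hz cutoff, which is the temporal cue the dish/ditch contrast depends on.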
16.
Lortie CL, Deschamps I, Guitton MJ, Tremblay P. Age Differences in Voice Evaluation: From Auditory-Perceptual Evaluation to Social Interactions. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH: JSLHR 2018; 61:227-245. [PMID: 29396575 DOI: 10.1044/2017_jslhr-s-16-0202]
Abstract
PURPOSE The factors that influence the evaluation of voice in adulthood, as well as the consequences of such evaluation on social interactions, are not well understood. Here, we examined the effect of listeners' age and the effect of talker age, sex, and smoking status on the auditory-perceptual evaluation of voice, voice-related psychosocial attributions, and perceived speech tempo. We also examined the voice dimensions affecting the propensity to engage in social interactions. METHOD Twenty-five younger (age 19-37 years) and 25 older (age 51-74 years) healthy adults participated in this cross-sectional study. Their task was to evaluate the voice of 80 talkers. RESULTS Statistical analyses revealed limited effects of the age of the listener on voice evaluation. Specifically, older listeners provided relatively more favorable voice ratings than younger listeners, mainly in terms of roughness. In contrast, the age of the talker had a broader impact on voice evaluation, affecting auditory-perceptual evaluations, psychosocial attributions, and perceived speech tempo. Some of these talker differences were dependent upon the sex of the talker and his or her smoking status. Finally, the results also show that voice-related psychosocial attribution was more strongly associated with the propensity of the listener to engage in social interactions with a person than auditory-perceptual dimensions and perceived speech tempo, especially for the younger adults. CONCLUSIONS These results suggest that age has a broad influence on voice evaluation, with a stronger impact for talker age compared with listener age. While voice-related psychosocial attributions may be an important determinant of social interactions, perceived voice quality and speech tempo appear to be less influential. SUPPLEMENTAL MATERIALS https://doi.org/10.23641/asha.5844102.
Affiliation(s)
- Catherine L Lortie
- Département de Réadaptation, Faculté de Médecine, Université Laval, Quebec City, Canada
- Département d'Ophtalmologie, d'Otorhinolaryngologie et de Chirurgie Cervico-Faciale, Faculté de Médecine, Université Laval, Quebec City, Canada
- CERVO Brain Research Centre, Quebec City, Canada
- Isabelle Deschamps
- Département de Réadaptation, Faculté de Médecine, Université Laval, Quebec City, Canada
- CERVO Brain Research Centre, Quebec City, Canada
- Matthieu J Guitton
- Département d'Ophtalmologie, d'Otorhinolaryngologie et de Chirurgie Cervico-Faciale, Faculté de Médecine, Université Laval, Quebec City, Canada
- CERVO Brain Research Centre, Quebec City, Canada
- Pascale Tremblay
- Département de Réadaptation, Faculté de Médecine, Université Laval, Quebec City, Canada
- CERVO Brain Research Centre, Quebec City, Canada
17.
Best V, Ahlstrom JB, Mason CR, Roverud E, Perrachione TK, Kidd G, Dubno JR. Talker identification: Effects of masking, hearing loss, and age. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 143:1085. [PMID: 29495693 PMCID: PMC5820061 DOI: 10.1121/1.5024333]
Abstract
The ability to identify who is talking is an important aspect of communication in social situations and, while empirical data are limited, it is possible that a disruption to this ability contributes to the difficulties experienced by listeners with hearing loss. In this study, talker identification was examined under both quiet and masked conditions. Subjects were grouped by hearing status (normal hearing/sensorineural hearing loss) and age (younger/older adults). Listeners first learned to identify the voices of four same-sex talkers in quiet, and then talker identification was assessed (1) in quiet, (2) in speech-shaped, steady-state noise, and (3) in the presence of a single, unfamiliar same-sex talker. Both younger and older adults with hearing loss, as well as older adults with normal hearing, generally performed more poorly than younger adults with normal hearing, although large individual differences were observed in all conditions. Regression analyses indicated that both age and hearing loss were predictors of performance in quiet, and there was some evidence for an additional contribution of hearing loss in the presence of masking. These findings suggest that both hearing loss and age may affect the ability to identify talkers in "cocktail party" situations.
Affiliation(s)
- Virginia Best
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Jayne B Ahlstrom
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, Charleston, South Carolina 29425, USA
- Christine R Mason
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Elin Roverud
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Tyler K Perrachione
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Gerald Kidd
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Judy R Dubno
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, Charleston, South Carolina 29425, USA
18.
Sullivan JR, Assmann PF, Hossain S, Schafer EC. Voice gender and the segregation of competing talkers: Perceptual learning in cochlear implant simulations. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 141:1643. [PMID: 28372046 PMCID: PMC5346103 DOI: 10.1121/1.4976002]
Abstract
Two experiments explored the role of differences in voice gender in the recognition of speech masked by a competing talker in cochlear implant simulations. Experiment 1 confirmed that listeners with normal hearing receive little benefit from differences in voice gender between a target and masker sentence in four- and eight-channel simulations, consistent with previous findings that cochlear implants deliver an impoverished representation of the cues for voice gender. However, gender differences led to small but significant improvements in word recognition with 16 and 32 channels. Experiment 2 assessed the benefits of perceptual training on the use of voice gender cues in an eight-channel simulation. Listeners were assigned to one of four groups: (1) word recognition training with target and masker differing in gender; (2) word recognition training with same-gender target and masker; (3) gender recognition training; or (4) control with no training. Significant improvements in word recognition were observed from pre- to post-test sessions for all three training groups compared to the control group. These improvements were maintained at the late session (one week following the last training session) for all three groups. There was an overall improvement in masked word recognition performance provided by gender mismatch following training, but the amount of benefit did not differ as a function of the type of training. The training effects observed here are consistent with a form of rapid perceptual learning that contributes to the segregation of competing voices but does not specifically enhance the benefits provided by voice gender cues.
Affiliation(s)
- Jessica R Sullivan
- Department of Communication Sciences & Professional Counseling, University of West Georgia, Carrollton, Georgia 30118, USA
- Peter F Assmann
- School of Behavioral and Brain Sciences, University of Texas at Dallas, Richardson, Texas 75083, USA
- Shaikat Hossain
- School of Behavioral and Brain Sciences, University of Texas at Dallas, Richardson, Texas 75083, USA
- Erin C Schafer
- College of Health and Public Service, University of North Texas, Denton, Texas 76203, USA
19.
Fogerty D, Bologna WJ, Ahlstrom JB, Dubno JR. Simultaneous and forward masking of vowels and stop consonants: Effects of age, hearing loss, and spectral shaping. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 141:1133. [PMID: 28253707 PMCID: PMC5848836 DOI: 10.1121/1.4976082]
Abstract
Fluctuating noise, common in everyday environments, has the potential to mask acoustic cues important for speech recognition. This study examined the extent to which acoustic cues for perception of vowels and stop consonants differ in their susceptibility to simultaneous and forward masking. Younger normal-hearing, older normal-hearing, and older hearing-impaired adults identified initial and final consonants or vowels in noise-masked syllables that had been spectrally shaped. The amount of shaping was determined by subjects' audiometric thresholds. A second group of younger adults with normal hearing was tested with spectral shaping determined by the mean audiogram of the hearing-impaired group. Stimulus timing ensured that the final 10, 40, or 100 ms of the syllable occurred after the masker offset. Results demonstrated that participants benefited from short temporal delays between the noise and speech for vowel identification, but required longer delays for stop consonant identification. Older adults with normal and impaired hearing, with sufficient audibility, required longer delays to obtain performance equivalent to that of the younger adults. Overall, these results demonstrate that in forward masking conditions, younger listeners can successfully identify vowels during short temporal intervals (i.e., one unmasked pitch period), with longer durations required for consonants and for older adults.
Affiliation(s)
- Daniel Fogerty
- Department of Communication Sciences and Disorders, University of South Carolina, 1224 Sumter Street, Suite 300, Columbia, South Carolina 29208, USA
- William J Bologna
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, South Carolina 29425-5500, USA
- Jayne B Ahlstrom
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, South Carolina 29425-5500, USA
- Judy R Dubno
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, South Carolina 29425-5500, USA
20.
Herrmann B, Parthasarathy A, Bartlett EL. Ageing affects dual encoding of periodicity and envelope shape in rat inferior colliculus neurons. Eur J Neurosci 2017; 45:299-311. [PMID: 27813207 PMCID: PMC5247336 DOI: 10.1111/ejn.13463]
Abstract
Extracting temporal periodicities and envelope shapes of sounds is important for listening within complex auditory scenes but declines behaviorally with age. Here, we recorded local field potentials (LFPs) and spikes to investigate how ageing affects the neural representations of different modulation rates and envelope shapes in the inferior colliculus of rats. We specifically aimed to explore the input-output (LFP-spike) response transformations of inferior colliculus neurons. Our results show that envelope shapes up to 256-Hz modulation rates are represented in the neural synchronisation phase lags in younger and older animals. Critically, ageing was associated with (i) an enhanced gain in onset response magnitude from LFPs to spikes; (ii) an enhanced gain in neural synchronisation strength from LFPs to spikes for a low modulation rate (45 Hz); (iii) a decrease in LFP synchronisation strength for higher modulation rates (128 and 256 Hz) and (iv) changes in neural synchronisation strength to different envelope shapes. The current age-related changes are discussed in the context of an altered excitation-inhibition balance accompanying ageing.
Affiliation(s)
- Björn Herrmann
- Department of Psychology & Brain and Mind Institute, The University of Western Ontario, London, ON, N6A 3K7, Canada
- Aravindakshan Parthasarathy
- Departments of Biological Sciences and Biomedical Engineering, Purdue University, West Lafayette, IN, 47906, USA
- Department of Otology and Laryngology, Harvard Medical School, and Eaton-Peabody Laboratories, Massachusetts Eye and Ear Infirmary, Boston, MA 02114, USA
- Edward L. Bartlett
- Departments of Biological Sciences and Biomedical Engineering, Purdue University, West Lafayette, IN, 47906, USA
21.
Casserly ED, Barney EC. Auditory Training With Multiple Talkers and Passage-Based Semantic Cohesion. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH: JSLHR 2017; 60:159-171. [PMID: 28002542 DOI: 10.1044/2016_jslhr-h-15-0357]
Abstract
PURPOSE Current auditory training methods typically result in improvements to speech recognition abilities in quiet, but learner gains may not extend to other domains in speech (e.g., recognition in noise) or self-assessed benefit. This study examined the potential of training involving multiple talkers and training emphasizing discourse-level top-down processing to produce more generalized learning. METHOD Normal-hearing participants (N = 64) were randomly assigned to 1 of 4 auditory training protocols using noise-vocoded speech simulating the processing of an 8-channel cochlear implant: sentence-based single-talker training, training with 24 different talkers, passage-based transcription training, and a control (transcribing unvocoded sentence materials). In all cases, participants completed 2 pretests under cochlear implant simulation, 1 hr of training, and 5 posttests to assess perceptual learning and cross-context generalization. RESULTS Performance above the control was seen in all 3 experimental groups for sentence recognition in quiet. In addition, the multitalker training method generalized to a context word-recognition task, and the passage training method caused gains in sentence recognition in noise. CONCLUSION The gains of the multitalker and passage training groups over the control suggest that, with relatively small modifications, improvements to the generalized outcomes of auditory training protocols may be possible.
Affiliation(s)
- Erin C Barney
- Department of Psychology, Trinity College, Hartford, CT
22.
Meister H, Fürsen K, Streicher B, Lang-Roth R, Walger M. The Use of Voice Cues for Speaker Gender Recognition in Cochlear Implant Recipients. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH: JSLHR 2016; 59:546-556. [PMID: 27135985 DOI: 10.1044/2015_jslhr-h-15-0128]
Abstract
PURPOSE The focus of this study was to examine the influence of fundamental frequency (F0) and vocal tract length (VTL) modifications on speaker gender recognition in cochlear implant (CI) recipients for different stimulus types. METHOD Single words and sentences were manipulated using isolated or combined F0 and VTL cues. Using an 11-point rating scale, CI recipients and listeners with normal hearing rated the maleness/femaleness of the corresponding voice. RESULTS Speaker gender ratings for combined F0 and VTL modifications were similar across all stimulus types in both CI recipients and listeners with normal hearing, although the CI recipients showed a somewhat larger ambiguity. In contrast to listeners with normal hearing, F0-VTL and F0-only modifications revealed similar ratings in the CI recipients when using words as stimuli. However, when sentences were used, a difference was found between F0-VTL-based and F0-based ratings. Modifying VTL cues alone did not affect ratings in the CI group. CONCLUSIONS Whereas speaker gender ratings by listeners with normal hearing relied on combined VTL and F0 cues, CI recipients made only limited use of VTL cues, which might be one reason behind problems with identifying the speaker on the basis of voice. However, use of the voice cues depended on stimulus type, with the greater information in sentences allowing a more detailed analysis than single words in both listener groups.
23.
Donai JJ, Jennings MB. Gaps-in-noise detection and gender identification from noise-vocoded vowel segments: Comparing performance of active musicians to non-musicians. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2016; 139:EL128. [PMID: 27250197 DOI: 10.1121/1.4947070]
Abstract
This study evaluated performance on a gender identification and temporal resolution task among active musicians and age-matched non-musicians. Brief duration (i.e., 50 and 100 ms) vowel segments produced by four adult male and four adult female speakers were spectro-temporally degraded using various parameters and presented to both groups for gender identification. Gap detection thresholds were measured using the gaps-in-noise (GIN) test. Contrary to the stated hypothesis, a significant difference in gender identification was not observed between the musician and non-musician listeners. A significant difference, however, was observed on the temporal resolution task, with the musician group achieving approximately 2 ms shorter gap detection thresholds on the GIN test compared to the non-musician counterparts. These results provide evidence supporting the potential benefits of musical training on temporal processing abilities, which have implications for the processing of speech in degraded listening environments and the enhanced processing of the fine-grained temporal aspects of the speech signal. The results also support the GIN test as an instrument sensitive to temporal processing differences among active musicians and non-musicians.
Affiliation(s)
- Jeremy J Donai
- Department of Communication Sciences and Disorders, West Virginia University, Morgantown, West Virginia 26506, USA
- Mariah B Jennings
- Department of Communication Sciences and Disorders, West Virginia University, Morgantown, West Virginia 26506, USA
24.
Schvartz-Leyzac KC, Chatterjee M. Fundamental-frequency discrimination using noise-band-vocoded harmonic complexes in older listeners with normal hearing. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 138:1687-1695. [PMID: 26428806 PMCID: PMC4592424 DOI: 10.1121/1.4929938]
Abstract
Voice-pitch cues provide detailed information about a talker that help a listener to understand speech in complex environments. Temporal-envelope based voice-pitch coding is important for listeners with hearing impairment, especially listeners with cochlear implants, as spectral resolution is not sufficient to provide a spectrally based voice-pitch cue. The effect of aging on the ability to glean voice-pitch information using temporal envelope cues is not completely understood. The current study measured fundamental frequency (f0) discrimination limens in normal-hearing younger and older adults while listening to noise-band vocoded harmonic complexes with varying numbers of spectral channels. Age-related disparities in performance were apparent across all conditions, independent of spectral degradation and/or fundamental frequency. The findings have important implications for older listeners with normal hearing and hearing loss, who may be inherently limited in their ability to perceive f0 cues due to senescent decline in auditory function.
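The stimuli in this study rely on the fact that a noise-band vocoder discards spectral fine structure but leaves f0 periodicity in each channel's temporal envelope. A small sketch (band edges, filter order, and stimulus parameters are assumptions for illustration) showing that the envelope of a band of unresolved harmonics fluctuates at f0:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def harmonic_complex(f0, fs, dur=0.5, n_harmonics=20):
    """Equal-amplitude harmonic complex at fundamental f0."""
    t = np.arange(int(dur * fs)) / fs
    return sum(np.sin(2 * np.pi * f0 * k * t) for k in range(1, n_harmonics + 1))

def band_envelope(x, fs, lo, hi):
    """Temporal envelope of one analysis band: the cue a noise-band
    vocoder transmits in place of spectral fine structure."""
    sos = butter(2, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return np.abs(hilbert(sosfiltfilt(sos, x)))
```

For a 110-Hz complex, a 2-4 kHz band contains only unresolved harmonics, so the band envelope beats at 110 Hz; this envelope periodicity is the temporal voice-pitch cue whose discrimination the study measured.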
Affiliation(s)
- Kara C Schvartz-Leyzac
- Department of Hearing and Speech Sciences, University of Maryland, 0100 LeFrak Hall, College Park, Maryland 20742, USA
- Monita Chatterjee
- Department of Hearing and Speech Sciences, University of Maryland, 0100 LeFrak Hall, College Park, Maryland 20742, USA
25.
Gender identification from high-pass filtered vowel segments: The use of high-frequency energy. Atten Percept Psychophys 2015; 77:2452-2462. [DOI: 10.3758/s13414-015-0945-y]
26.
Chatterjee M, Zion DJ, Deroche ML, Burianek BA, Limb CJ, Goren AP, Kulkarni AM, Christensen JA. Voice emotion recognition by cochlear-implanted children and their normally-hearing peers. Hear Res 2015; 322:151-162. [PMID: 25448167 PMCID: PMC4615700 DOI: 10.1016/j.heares.2014.10.003]
Abstract
Despite their remarkable success in bringing spoken language to hearing impaired listeners, the signal transmitted through cochlear implants (CIs) remains impoverished in spectro-temporal fine structure. As a consequence, pitch-dominant information, such as voice emotion, is diminished. For young children, the ability to correctly identify the mood/intent of the speaker (which may not always be visible in their facial expression) is an important aspect of social and linguistic development. Previous work in the field has shown that children with cochlear implants (cCI) have significant deficits in voice emotion recognition relative to their normally hearing peers (cNH). Here, we report on voice emotion recognition by a cohort of 36 school-aged cCI. Additionally, we provide, for the first time, a comparison of their performance to that of cNH and NH adults (aNH) listening to CI simulations of the same stimuli. We also provide comparisons to the performance of adult listeners with CIs (aCI), most of whom learned language primarily through normal acoustic hearing. Results indicate that, despite strong variability, on average, cCI perform similarly to their adult counterparts; that both groups' mean performance is similar to aNHs' performance with 8-channel noise-vocoded speech; that cNH achieve excellent scores in voice emotion recognition with full-spectrum speech, but on average, show significantly poorer scores than aNH with 8-channel noise-vocoded speech. A strong developmental effect was observed in the cNH with noise-vocoded speech in this task. These results point to the considerable benefit obtained by cochlear-implanted children from their devices, but also underscore the need for further research and development in this important and neglected area.
Affiliation(s)
- Monita Chatterjee
- Auditory Prostheses & Perception Lab., Boys Town National Research Hospital, 555 N 30th St, Omaha, NE 68131, USA
- Danielle J Zion
- Department of Hearing & Speech Sciences, University of Maryland, 0100 LeFrak Hall, College Park, MD 20742, USA
- Mickael L Deroche
- Department of Otolaryngology, Johns Hopkins University School of Medicine, 818 Ross Research Building, 720 Rutland Avenue, Baltimore, MD, USA
- Brooke A Burianek
- Auditory Prostheses & Perception Lab., Boys Town National Research Hospital, 555 N 30th St, Omaha, NE 68131, USA
- Charles J Limb
- Department of Otolaryngology, Johns Hopkins University School of Medicine, 818 Ross Research Building, 720 Rutland Avenue, Baltimore, MD, USA
- Alison P Goren
- Auditory Prostheses & Perception Lab., Boys Town National Research Hospital, 555 N 30th St, Omaha, NE 68131, USA; Department of Hearing & Speech Sciences, University of Maryland, 0100 LeFrak Hall, College Park, MD 20742, USA
- Aditya M Kulkarni
- Auditory Prostheses & Perception Lab., Boys Town National Research Hospital, 555 N 30th St, Omaha, NE 68131, USA
- Julie A Christensen
- Auditory Prostheses & Perception Lab., Boys Town National Research Hospital, 555 N 30th St, Omaha, NE 68131, USA
27.
Gaudrain E, Başkent D. Factors limiting vocal-tract length discrimination in cochlear implant simulations. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 137:1298-1308. [PMID: 25786943 DOI: 10.1121/1.4908235]
Abstract
Perception of voice characteristics allows normal hearing listeners to identify the gender of a speaker, and to better segregate speakers from each other in cocktail party situations. This benefit is largely driven by the perception of two vocal characteristics of the speaker: The fundamental frequency (F0) and the vocal-tract length (VTL). Previous studies have suggested that cochlear implant (CI) users have difficulties in perceiving these cues. The aim of the present study was to investigate possible causes for limited sensitivity to VTL differences in CI users. Different acoustic simulations of CI stimulation were implemented to characterize the role of spectral resolution on VTL, both in terms of number of channels and amount of channel interaction. The results indicate that with 12 channels, channel interaction caused by current spread is likely to prevent CI users from perceiving VTL differences typically found between male and female speakers.
Collapse
Affiliation(s)
- Etienne Gaudrain
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
| | - Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
| |
Collapse
|
28
|
Stilp CE, Goupell MJ. Spectral and temporal resolutions of information-bearing acoustic changes for understanding vocoded sentences. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 137:844-55. [PMID: 25698018 PMCID: PMC4336249 DOI: 10.1121/1.4906179] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/23/2014] [Revised: 12/12/2014] [Accepted: 12/27/2014] [Indexed: 06/04/2023]
Abstract
Short-time spectral changes in the speech signal are important for understanding noise-vocoded sentences. These information-bearing acoustic changes, measured using cochlea-scaled entropy in cochlear implant simulations [CSECI; Stilp et al. (2013). J. Acoust. Soc. Am. 133(2), EL136-EL141; Stilp (2014). J. Acoust. Soc. Am. 135(3), 1518-1529], may offer better understanding of speech perception by cochlear implant (CI) users. However, perceptual importance of CSECI for normal-hearing listeners was tested at only one spectral resolution and one temporal resolution, limiting generalizability of results to CI users. Here, experiments investigated the importance of these informational changes for understanding noise-vocoded sentences at different spectral resolutions (4-24 spectral channels; Experiment 1), temporal resolutions (4-64 Hz cutoff for low-pass filters that extracted amplitude envelopes; Experiment 2), or when both parameters varied (6-12 channels, 8-32 Hz; Experiment 3). Sentence intelligibility was reduced more by replacing high-CSECI intervals with noise than replacing low-CSECI intervals, but only when sentences had sufficient spectral and/or temporal resolution. High-CSECI intervals were more important for speech understanding as spectral resolution worsened and temporal resolution improved. Trade-offs between CSECI and intermediate spectral and temporal resolutions were minimal. These results suggest that signal processing strategies that emphasize information-bearing acoustic changes in speech may improve speech perception for CI users.
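The core of the cochlea-scaled entropy measure cited above is a frame-to-frame spectral distance. A stripped-down sketch of that idea, omitting the ERB-spaced filterbank and the interval bookkeeping of the published CSECI metric:

```python
import numpy as np

def spectral_change(spectra):
    """Euclidean distance between successive spectral slices.
    `spectra`: (frames, channels) array of band energies, e.g., from an
    ERB-spaced filterbank (not implemented here). High values flag
    information-bearing acoustic change in the CSE sense."""
    diffs = np.diff(spectra, axis=0)          # change between adjacent frames
    return np.sqrt((diffs ** 2).sum(axis=1))  # one distance per frame pair
```

Replacing the intervals around the largest distances with noise corresponds to the "high-CSECI" replacement condition in the experiments described above.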
Collapse
Affiliation(s)
- Christian E Stilp
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292
| | - Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742
| |
Collapse
|
29
|
Interdependence of linguistic and indexical speech perception skills in school-age children with early cochlear implantation. Ear Hear 2014; 34:562-74. [PMID: 23652814 DOI: 10.1097/aud.0b013e31828d2bd6] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES This study documented the ability of experienced pediatric cochlear implant (CI) users to perceive linguistic properties (what is said) and indexical attributes (emotional intent and talker identity) of speech, and examined the extent to which linguistic (LSP) and indexical (ISP) perception skills are related. Preimplant-aided hearing, age at implantation, speech processor technology, CI-aided thresholds, sequential bilateral cochlear implantation, and academic integration with hearing age-mates were examined for their possible relationships to both LSP and ISP skills. DESIGN Sixty 9- to 12-year olds, first implanted at an early age (12 to 38 months), participated in a comprehensive test battery that included the following LSP skills: (1) recognition of monosyllabic words at loud and soft levels, (2) repetition of phonemes and suprasegmental features from nonwords, and (3) recognition of key words from sentences presented within a noise background, and the following ISP skills: (1) discrimination of across-gender and within-gender (female) talkers and (2) identification and discrimination of emotional content from spoken sentences. A group of 30 age-matched children without hearing loss completed the nonword repetition, and talker- and emotion-perception tasks for comparison. RESULTS Word-recognition scores decreased with signal level from a mean of 77% correct at 70 dB SPL to 52% at 50 dB SPL. On average, CI users recognized 50% of key words presented in sentences that were 9.8 dB above background noise. Phonetic properties were repeated from nonword stimuli at about the same level of accuracy as suprasegmental attributes (70 and 75%, respectively). The majority of CI users identified emotional content and differentiated talkers significantly above chance levels. Scores on LSP and ISP measures were combined into separate principal component scores and these components were highly correlated (r = 0.76). Both LSP and ISP component scores were higher for children who received a CI at the youngest ages, upgraded to more recent CI technology and had lower CI-aided thresholds. Higher scores, for both LSP and ISP components, were also associated with higher language levels and mainstreaming at younger ages. Higher ISP scores were associated with better social skills. CONCLUSIONS Results strongly support a link between indexical and linguistic properties in perceptual analysis of speech. These two channels of information appear to be processed together in parallel by the auditory system and are inseparable in perception. Better speech performance, for both linguistic and indexical perception, is associated with younger age at implantation and use of more recent speech processor technology. Children with better speech perception demonstrated better spoken language, earlier academic mainstreaming, and placement in more typically sized classrooms (i.e., >20 students). Well-developed social skills were more highly associated with the ability to discriminate the nuances of talker identity and emotion than with the ability to recognize words and sentences through listening. The extent to which early cochlear implantation enabled these early-implanted children to make use of both linguistic and indexical properties of speech influenced not only their development of spoken language, but also their ability to function successfully in a hearing world.
Collapse
|
30
|
Engineer CT, Perez CA, Carraway RS, Chang KQ, Roland JL, Sloan AM, Kilgard MP. Similarity of cortical activity patterns predicts generalization behavior. PLoS One 2013; 8:e78607. [PMID: 24147140 PMCID: PMC3797841 DOI: 10.1371/journal.pone.0078607] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2013] [Accepted: 09/20/2013] [Indexed: 11/23/2022] Open
Abstract
Humans and animals readily generalize previously learned knowledge to new situations. Determining similarity is critical for assigning category membership to a novel stimulus. We tested the hypothesis that category membership is initially encoded by the similarity of the activity pattern evoked by a novel stimulus to the patterns from known categories. We provide behavioral and neurophysiological evidence that activity patterns in primary auditory cortex contain sufficient information to explain behavioral categorization of novel speech sounds by rats. Our results suggest that category membership might be encoded by the similarity of the activity pattern evoked by a novel speech sound to the patterns evoked by known sounds. Categorization based on featureless pattern matching may represent a general neural mechanism for ensuring accurate generalization across sensory and cognitive systems.
Collapse
Affiliation(s)
- Crystal T. Engineer
- School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas, United States of America
| | - Claudia A. Perez
- School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas, United States of America
| | - Ryan S. Carraway
- School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas, United States of America
| | - Kevin Q. Chang
- School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas, United States of America
| | - Jarod L. Roland
- School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas, United States of America
| | - Andrew M. Sloan
- School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas, United States of America
| | - Michael P. Kilgard
- School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas, United States of America
| |
Collapse
|
31
|
Massida Z, Marx M, Belin P, James C, Fraysse B, Barone P, Deguine O. Gender categorization in cochlear implant users. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2013; 56:1389-1401. [PMID: 24023381 DOI: 10.1044/1092-4388(2013/12-0132)] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
PURPOSE In this study, the authors examined the ability of subjects with cochlear implants (CIs) to discriminate voice gender and how this ability evolved as a function of CI experience. METHOD The authors presented a continuum of voice samples created by voice morphing, with 9 intermediate acoustic parameter steps between a typical male and a typical female. This method allowed for the evaluation of gender categorization not only when acoustical features were specific to gender but also for more ambiguous cases, when fundamental frequency or formant distribution were located between typical values. RESULTS Results showed a global, though variable, deficit for voice gender categorization in CI recipients compared with subjects with normal hearing. This deficit was stronger for ambiguous stimuli in the voice continuum: Average performance scores for CI users were 58% lower than average scores for subjects with normal hearing in cases of ambiguous stimuli and 19% lower for typical male and female voices. The authors found no significant improvement in voice gender categorization with CI experience. CONCLUSIONS These results emphasize the dissociation between recovery of speech recognition and voice feature perception after cochlear implantation. This large and durable deficit may be related to spectral and temporal degradation induced by CI sound coding, or it may be related to central voice processing deficits.
Collapse
|
32
|
Impaired Timing and Frequency Discrimination in High-functioning Autism Spectrum Disorders. J Autism Dev Disord 2013; 43:2312-28. [DOI: 10.1007/s10803-013-1778-y] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
|
33
|
Peng SC, Chatterjee M, Lu N. Acoustic cue integration in speech intonation recognition with cochlear implants. Trends Amplif 2012; 16:67-82. [PMID: 22790392 PMCID: PMC3560417 DOI: 10.1177/1084713812451159] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
The present article reports on the perceptual weighting of prosodic cues in question-statement identification by adult cochlear implant (CI) listeners. Acoustic analyses of normal-hearing (NH) listeners' production of sentences spoken as questions or statements confirmed that in English the last bisyllabic word in a sentence carries the dominant cues (F0, duration, and intensity patterns) for the contrast. Furthermore, these analyses showed that the F0 contour is the primary cue for the question-statement contrast, with intensity and duration changes conveying important but less reliable information. On the basis of these acoustic findings, the authors examined adult CI listeners' performance in two question-statement identification tasks. In Task 1, 13 CI listeners' question-statement identification accuracy was measured using naturally uttered sentences matched for their syntactic structures. In Task 2, the same listeners' perceptual cue weighting in question-statement identification was assessed using resynthesized single-word stimuli, within which fundamental frequency (F0), intensity, and duration properties were systematically manipulated. Both tasks were also conducted with four NH listeners with full-spectrum and noise-band-vocoded stimuli. Perceptual cue weighting was assessed by comparing the estimated coefficients in logistic models fitted to the data. Of the 13 CI listeners, 7 achieved high performance levels in Task 1. The results of Task 2 indicated that multiple sources of acoustic cues for question-statement identification were utilized to different extents depending on the listening conditions (e.g., full spectrum vs. spectrally degraded) or the listeners' hearing and amplification status (e.g., CI vs. NH).
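The cue-weighting analysis described above fits a logistic model to identification responses and compares the estimated coefficients. A minimal sketch of that idea on synthetic data (plain gradient ascent; the cue set, z-scoring, and optimizer settings are assumptions, not the authors' analysis code):

```python
import numpy as np

def fit_cue_weights(X, y, lr=0.1, n_iter=5000):
    """Illustrative logistic model of question-statement responses.
    y (0 = statement, 1 = question) is regressed on cue values in X
    (columns, e.g., F0 change, intensity change, duration change).
    The fitted coefficients index perceptual cue weighting."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)    # z-score so coefficients compare
    Xb = np.hstack([np.ones((len(Xz), 1)), Xz])  # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
        w += lr * Xb.T @ (y - p) / len(y)        # gradient ascent on log-likelihood
    return w[1:]                                 # cue coefficients (intercept dropped)
```

A listener who relies mainly on F0 would show a large coefficient on the F0 column and small coefficients on the others, mirroring the model comparison reported above.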
Collapse
Affiliation(s)
- Shu-Chen Peng
- Division of Ophthalmic, Neurological, and Ear, Nose and Throat Devices, Office of Device Evaluation, U.S. Food and Drug Administration, 10903 New Hampshire Ave, Silver Spring, MD 20993, USA.
| | | | | |
Collapse
|