1
Kalaivanan K. Lexical tone perception and learning in older adults: A review and future directions. Q J Exp Psychol (Hove) 2023:17470218231211722. [PMID: 37873972] [DOI: 10.1177/17470218231211722]
Abstract
While the literature accounts well for how aging influences segmental properties of speech, less is known about its influence on suprasegmental properties such as lexical tones. In addition, foreign-language learning is increasingly endorsed as a potential intervention to boost cognitive reserve and overall well-being in older adults. Empirical studies on lexical tone learning are plentiful for young learners but comparatively scarce for older learners, whose challenges in this domain may differ because of aging and other learner-internal factors. This review consolidates behavioural and neuroscientific research on lexical tone, speech perception, factors characterising learner groups, and other variables that influence lexical tone perception and learning in older adults. Factors commonly identified as influencing tone learning in younger adults, such as musical experience, language background, and motivation to learn a new language, are discussed in relation to older learner groups, and recommendations for boosting lexical tone learning in older age are provided based on existing studies.
Affiliation(s)
- Kastoori Kalaivanan
- Neuroscience and Behavioural Disorders Programme, DUKE-NUS Medical School, Singapore
2
Roux J, Hanekom JJ. Effect of stimulation parameters on sequential current-steered stimuli in cochlear implants. J Acoust Soc Am 2022; 152:609. [PMID: 35931549] [DOI: 10.1121/10.0012763]
Abstract
Manipulation of cochlear implant (CI) place pitch was carried out with current steering by stimulating two CI electrodes sequentially. The objective was to investigate whether shifts in activated neural populations could be achieved to produce salient pitch differences and to determine which stimulation parameters are most effective for current steering: the pulse rate and pulse width of the electrical stimuli and the distance between the two current-steering electrodes. Nine CI users participated, and ten ears were tested. The pattern of pitch changes was not consistent across listeners, but the data suggest that individualized selection of stimulation parameters may be used to effect place pitch changes with sequential current steering. Individual analyses showed that pulse width generally had little influence on the effectiveness of current steering with sequential stimuli, whereas more salient place pitch shifts were often achieved at wider electrode spacing or when the stimulation pulse rate matched that of the listener's clinical MAP (the set of stimulation parameters). The results imply that current steering may be used in CIs that allow only sequential stimulation to achieve place pitch manipulation.
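As a rough illustration of the current-steering idea described above, a total current can be split between two electrodes by a steering coefficient; the sketch below is hypothetical (the weighting scheme, names, and current level are illustrative assumptions, not the authors' implementation):

```python
def steer_current(total_ua, alpha):
    """Split a total current (in microamps) between two electrodes.

    alpha = 0 stimulates only the apical electrode, alpha = 1 only the
    basal one; intermediate values shift the effective place of
    stimulation, and hence the place pitch, between the two contacts.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    apical = (1.0 - alpha) * total_ua
    basal = alpha * total_ua
    return apical, basal

# Sweeping alpha from 0 to 1 moves the centroid of excitation
# from the apical toward the basal electrode.
pairs = [steer_current(800.0, a / 4) for a in range(5)]
```

In sequential (rather than simultaneous) stimulation, the two weighted pulses would be delivered one after the other, which is the mode investigated in the study above.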
Affiliation(s)
- Johanie Roux
- Bioengineering, Department of Electrical, Electronic, and Computer Engineering, University of Pretoria, University Road, Pretoria 0002, South Africa
- Johan J Hanekom
- Bioengineering, Department of Electrical, Electronic, and Computer Engineering, University of Pretoria, University Road, Pretoria 0002, South Africa
3
Huang EHH, Wu CM, Lin HC. Combination and Comparison of Sound Coding Strategies Using Cochlear Implant Simulation With Mandarin Speech. IEEE Trans Neural Syst Rehabil Eng 2021; 29:2407-2416. [PMID: 34767509] [DOI: 10.1109/tnsre.2021.3128064]
Abstract
Three cochlear implant (CI) sound coding strategies were combined in the same signal-processing path and compared for speech intelligibility with vocoded Mandarin sentences. The three strategies, the biologically inspired hearing aid algorithm (BioAid), envelope enhancement (EE), and fundamental frequency modulation (F0mod), were each combined with the advanced combination encoder (ACE) strategy, yielding four single and four combined coding strategies. Mandarin sentences with speech-shaped noise were processed using these strategies, and speech understanding was evaluated using the short-time objective intelligibility (STOI) measure and subjective sentence recognition tests with normal-hearing listeners. For signal-to-noise ratios of 5 dB or above, the EE strategy had slightly higher average scores than ACE in both STOI and listening tests. Adding EE to BioAid slightly increased the mean scores for BioAid+EE, the combined strategy with the highest scores in both objective and subjective intelligibility. No benefits of BioAid, F0mod, or the four combined strategies were observed in CI simulation. These findings may be useful for the future design of coding strategies and related studies with Mandarin.
4
Arjmandi M, Houston D, Wang Y, Dilley L. Estimating the reduced benefit of infant-directed speech in cochlear implant-related speech processing. Neurosci Res 2021; 171:49-61. [PMID: 33484749] [PMCID: PMC8289972] [DOI: 10.1016/j.neures.2021.01.007]
Abstract
Caregivers modify their speech when talking to infants, a speaking style known as infant-directed speech (IDS). Compared with adult-directed speech (ADS), IDS facilitates language learning in infants with normal hearing (NH). While infants with NH and those with cochlear implants (CIs) prefer listening to IDS over ADS, it is not yet known how CI processing affects the acoustic distinctiveness between ADS and IDS or their intelligibility. This study analyzed the speech of seven female adult talkers to model the effects of simulated CI processing on (1) the acoustic distinctiveness between ADS and IDS, (2) estimates of the intelligibility of caregivers' speech in ADS and IDS, and (3) individual differences in caregivers' ADS-to-IDS modification and estimated speech intelligibility. Results suggest that CI processing substantially degrades both the acoustic distinctiveness between ADS and IDS and the intelligibility benefit derived from ADS-to-IDS modifications. Moreover, the variability observed across individual talkers in the acoustic implementation of ADS-to-IDS modification and in estimated speech intelligibility was significantly reduced by CI processing. The findings are discussed in the context of the link between IDS and language learning in infants with CIs.
Affiliation(s)
- Meisam Arjmandi
- Department of Communicative Sciences and Disorders, Michigan State University, 1026 Red Cedar Road, East Lansing, MI 48824, USA.
- Derek Houston
- Department of Otolaryngology - Head and Neck Surgery, The Ohio State University, 915 Olentangy River Road, Columbus, OH 43212, USA
- Yuanyuan Wang
- Department of Otolaryngology - Head and Neck Surgery, The Ohio State University, 915 Olentangy River Road, Columbus, OH 43212, USA
- Laura Dilley
- Department of Communicative Sciences and Disorders, Michigan State University, 1026 Red Cedar Road, East Lansing, MI 48824, USA
5
Electro-Tactile Stimulation Enhances Cochlear-Implant Melody Recognition: Effects of Rhythm and Musical Training. Ear Hear 2021; 41:106-113. [PMID: 31884501] [DOI: 10.1097/aud.0000000000000749]
Abstract
OBJECTIVES Electro-acoustic stimulation (EAS) enhances speech and music perception in cochlear-implant (CI) users who have residual low-frequency acoustic hearing. For CI users without residual low-frequency acoustic hearing, tactile stimulation may be used in a similar fashion to enhance CI performance. Previous studies showed that electro-tactile stimulation (ETS) enhanced speech recognition in noise and tonal language perception for CI listeners. Here, we examined the effect of ETS on melody recognition in both musician and nonmusician CI users. DESIGN Nine musician and eight nonmusician CI users were tested in a melody recognition task with or without rhythmic cues in three conditions: CI only (E), tactile only (T), and combined CI and tactile stimulation (ETS). RESULTS Overall, combined electrical and tactile stimulation enhanced melody recognition in CI users by 9 percentage points. Two additional findings were observed. First, musician CI users outperformed nonmusician CI users in melody recognition, but the size of the enhancement effect was similar between the two groups. Second, the ETS enhancement was significantly larger with nonrhythmic melodies than with rhythmic melodies in both groups. CONCLUSIONS These findings suggest that, independent of musical experience, the size of the ETS enhancement depends on the efficiency of integration between tactile and auditory stimulation, and that the mechanism of the ETS enhancement is improved electric pitch perception. The present study supports the hypothesis that tactile stimulation can be used to improve pitch perception in CI users.
6
Lu T, Li Q, Zhang C, Chen M, Wang Z, Li S. The sensitivity of different methods for detecting abnormalities in auditory nerve function. Biomed Eng Online 2020; 19:7. [PMID: 32013979] [PMCID: PMC6998811] [DOI: 10.1186/s12938-020-0750-2]
Abstract
Background Cochlear implants (CIs) have become important for the treatment of severe-to-profound sensorineural hearing loss (SNHL). Electrically evoked compound action potentials (ECAPs) and electrically evoked auditory brainstem responses (EABRs), which can be recorded with minimal patient cooperation, have become reliable tools for postoperative assessment of tone measurement and speech recognition. However, few studies have compared the electrophysiological characteristics of the auditory nerve measured with ECAPs and EABRs under different functional states of the auditory nerve (FSANs). We used guinea pig models in which six electrodes were implanted unilaterally with continuous electrical stimulation (ES) for 4 h. The amplitude growth functions (AGFs) of the alternating-polarity ECAP (AP-ECAP) and forward-masking subtraction ECAP (FM-ECAP), as well as the EABR waves under the “normal” and “abnormal” FSANs, were obtained. Results Both the AP-ECAP and FM-ECAP thresholds were significantly higher than those measured by EABR under both the “normal” and “abnormal” FSANs (p < 0.05). There was a significant difference in the slope values between electrodes 1 and 2 and electrodes 3 and 4 in terms of the AP-ECAP under the “abnormal” FSAN (p < 0.05). The threshold gaps between the AP-ECAP and FM-ECAP were significantly larger under the “abnormal” FSAN than under the “normal” FSAN (p < 0.05). Conclusions Both ECAP thresholds were higher than the EABR thresholds, and the AP-ECAP was more sensitive than the FM-ECAP under the “abnormal” FSAN.
7
The Temporal Fine Structure of Background Noise Determines the Benefit of Bimodal Hearing for Recognizing Speech. J Assoc Res Otolaryngol 2020; 21:527-544. [PMID: 33104927] [PMCID: PMC7644728] [DOI: 10.1007/s10162-020-00772-1]
Abstract
Cochlear implant (CI) users have more difficulty understanding speech in temporally modulated noise than in steady-state (SS) noise. This is thought to be caused by the limited low-frequency information that CIs provide, as well as by the envelope coding in CIs that discards the temporal fine structure (TFS). Contralateral amplification with a hearing aid, referred to as bimodal hearing, can potentially provide CI users with TFS cues to complement the envelope cues provided by the CI signal. In this study, we investigated whether the use of a CI alone provides access to only envelope cues and whether acoustic amplification can provide additional access to TFS cues. To this end, we evaluated speech recognition in bimodal listeners, using SS noise and two amplitude-modulated noise types, namely babble noise and amplitude-modulated steady-state (AMSS) noise. We hypothesized that speech recognition in noise depends on the envelope of the noise, but not on its TFS when listening with a CI. Secondly, we hypothesized that the amount of benefit gained by the addition of a contralateral hearing aid depends on both the envelope and TFS of the noise. The two amplitude-modulated noise types decreased speech recognition more effectively than SS noise. Against expectations, however, we found that babble noise decreased speech recognition more effectively than AMSS noise in the CI-only condition. Therefore, we rejected our hypothesis that TFS is not available to CI users. In line with expectations, we found that the bimodal benefit was highest in babble noise. However, there was no significant difference between the bimodal benefit obtained in SS and AMSS noise. Our results suggest that a CI alone can provide TFS cues and that bimodal benefits in noise depend on TFS, but not on the envelope of the noise.
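The envelope/TFS distinction underlying this study is conventionally computed from the analytic signal: a real signal is split as x(t) ≈ env(t) · cos(phase(t)). The following minimal sketch (not the study's own code) shows the standard Hilbert-based decomposition, built with an FFT so no SciPy dependency is needed:

```python
import numpy as np

def envelope_and_tfs(x):
    """Decompose a real signal into its Hilbert envelope and temporal
    fine structure (TFS): x(t) ~= env(t) * cos(phase(t)).

    The analytic signal is obtained by zeroing the negative-frequency
    half of the spectrum and doubling the positive half.
    """
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(X * h)
    env = np.abs(analytic)          # slow amplitude envelope
    tfs = np.cos(np.angle(analytic))  # fast fine-structure carrier
    return env, tfs

# An amplitude-modulated tone: the envelope recovers the slow 4 Hz
# modulation, while the TFS keeps the 1000 Hz carrier.
fs = 16000
t = np.arange(fs) / fs
carrier = np.cos(2 * np.pi * 1000 * t)
modulator = 1.0 + 0.5 * np.cos(2 * np.pi * 4 * t)
env, tfs = envelope_and_tfs(modulator * carrier)
```

CI processing transmits `env` per frequency channel and discards `tfs`, which is why the abstract's finding of TFS sensitivity in CI-only listening is surprising.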
8
Shen J, Souza PE. The ability to glimpse dynamic pitch in noise by younger and older listeners. J Acoust Soc Am 2019; 146:EL232. [PMID: 31590538] [PMCID: PMC6748858] [DOI: 10.1121/1.5126021]
Abstract
While dynamic pitch is helpful for speech perception in temporally modulated noise, the ability to benefit from this cue varies substantially among older listeners. To examine the perceptual factors that contribute to this variability, this study characterized individuals' ability to perceive dynamic pitch segments extracted from real speech and embedded in temporally modulated noise. Data from younger and older listeners showed that stronger pitch contours were more easily perceived than weaker ones. The resulting metric significantly predicted speech-in-noise ability in older listeners. Potential implications of this work are discussed.
Affiliation(s)
- Jing Shen
- Western Michigan University, Kalamazoo, Michigan 49008, USA
9
Gaudrain E, Başkent D. Discrimination of Voice Pitch and Vocal-Tract Length in Cochlear Implant Users. Ear Hear 2019; 39:226-237. [PMID: 28799983] [PMCID: PMC5839701] [DOI: 10.1097/aud.0000000000000480]
Abstract
OBJECTIVES When listening to two competing speakers, normal-hearing (NH) listeners can take advantage of voice differences between the speakers. Users of cochlear implants (CIs) have difficulty in perceiving speech on speech. Previous literature has indicated sensitivity to voice pitch (related to fundamental frequency, F0) to be poor among implant users, while sensitivity to vocal-tract length (VTL; related to the height of the speaker and formant frequencies), the other principal voice characteristic, has not been directly investigated in CIs. A few recent studies evaluated F0 and VTL perception indirectly, through voice gender categorization, which relies on perception of both voice cues. These studies revealed that, contrary to prior literature, CI users seem to rely exclusively on F0 while not utilizing VTL to perform this task. The objective of the present study was to directly and systematically assess raw sensitivity to F0 and VTL differences in CI users to define the extent of the deficit in voice perception. DESIGN The just-noticeable differences (JNDs) for F0 and VTL were measured in 11 CI listeners using triplets of consonant-vowel syllables in an adaptive three-alternative forced choice method. RESULTS The results showed that while NH listeners had average JNDs of 1.95 and 1.73 semitones (st) for F0 and VTL, respectively, CI listeners showed JNDs of 9.19 and 7.19 st. These JNDs correspond to differences of 70% in F0 and 52% in VTL. For comparison to the natural range of voices in the population, the F0 JND in CIs remains smaller than the typical male-female F0 difference. However, the average VTL JND in CIs is about twice as large as the typical male-female VTL difference. CONCLUSIONS These findings, thus, directly confirm that CI listeners do not seem to have sufficient access to VTL cues, likely as a result of limited spectral resolution, and, hence, that CI listeners' voice perception deficit goes beyond poor perception of F0. These results provide a potential common explanation not only for a number of deficits observed in CI listeners, such as voice identification and gender categorization, but also for competing speech perception.
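The semitone-to-percent figures quoted above follow directly from the definition of a semitone as a frequency ratio of 2^(1/12); a quick check of the conversion:

```python
def semitones_to_percent(st):
    """Convert a difference in semitones to a percent change in
    frequency (one semitone = a ratio of 2**(1/12))."""
    return (2.0 ** (st / 12.0) - 1.0) * 100.0

# JNDs reported in the abstract above:
ci_f0 = semitones_to_percent(9.19)   # ~70 % difference in F0 for CI users
ci_vtl = semitones_to_percent(7.19)  # ~52 % difference in VTL for CI users
nh_f0 = semitones_to_percent(1.95)   # ~12 % for NH listeners
```

An octave (12 st) corresponds to exactly a 100% change, which makes the scale easy to sanity-check.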
Affiliation(s)
- Etienne Gaudrain
- University of Groningen, University Medical Center Groningen, Department of Otorhinolaryngology-Head and Neck Surgery, Groningen, The Netherlands; CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Université Lyon, Lyon, France; and Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
- Deniz Başkent
- University of Groningen, University Medical Center Groningen, Department of Otorhinolaryngology-Head and Neck Surgery, Groningen, The Netherlands; CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Université Lyon, Lyon, France; and Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
10
Müller V, Klünter H, Fürstenberg D, Meister H, Walger M, Lang-Roth R. Examination of Prosody and Timbre Perception in Adults With Cochlear Implants Comparing Different Fine Structure Coding Strategies. Am J Audiol 2018. [PMID: 29536106] [DOI: 10.1044/2017_aja-17-0046]
Abstract
PURPOSE This study investigated whether adults with cochlear implants benefit from a change of fine structure (FS) coding strategy with regard to the discrimination of prosodic speech cues and timbre cues and the identification of natural instruments. The FS processing (FSP) coding strategy was compared with 2 settings of the FS4 strategy. METHOD A longitudinal, double-blinded crossover study was conducted in 2 parts, with 14 participants in the first part and 12 in the second. Each part lasted 3 months, during which participants were alternately fitted with either the established FSP strategy or 1 of the 2 newly developed FS4 settings. Participants completed an intonation identification test; a timbre discrimination test in which 1 of 2 isolated cues, either the spectral centroid or the spectral irregularity, was changed; and an instrument identification test. RESULTS A significant effect was seen in the discrimination of spectral irregularity with the FS4 setting in which the upper envelope channels had a low stimulation rate; no such improvement was seen with the FS4 setting that had a higher stimulation rate on the envelope channels. CONCLUSIONS In general, the FSP strategy and the 2 FS4 settings provided similar levels of performance in the perception of prosody and timbre cues, as well as in the identification of instruments.
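The spectral centroid varied in the timbre test above is a standard brightness correlate: the amplitude-weighted mean frequency of the magnitude spectrum. A generic sketch (not the study's stimulus-generation code):

```python
import numpy as np

def spectral_centroid(x, fs):
    """Amplitude-weighted mean frequency of the magnitude spectrum,
    a standard acoustic correlate of perceived brightness."""
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return np.sum(freqs * mag) / np.sum(mag)

# A pure 440 Hz tone has its centroid at the tone frequency;
# adding strong upper partials would pull the centroid upward.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)
c = spectral_centroid(tone, fs)
```

Raising the centroid while keeping partial amplitudes otherwise fixed is one common way such timbre-discrimination stimuli are constructed.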
Affiliation(s)
- Verena Müller
- Clinic of Otorhinolaryngology, Head and Neck Surgery and Cochlear Implant Centre, University of Cologne, Germany
- Heinz Klünter
- Clinic of Otorhinolaryngology, Head and Neck Surgery and Cochlear Implant Centre, University of Cologne, Germany
- Dirk Fürstenberg
- Clinic of Otorhinolaryngology, Head and Neck Surgery and Cochlear Implant Centre, University of Cologne, Germany
- Hartmut Meister
- Jean Uhrmacher Institute for Clinical ENT-Research, University of Cologne, Germany
- Martin Walger
- Clinic of Otorhinolaryngology, Head and Neck Surgery and Cochlear Implant Centre, University of Cologne, Germany
- Jean Uhrmacher Institute for Clinical ENT-Research, University of Cologne, Germany
- Ruth Lang-Roth
- Clinic of Otorhinolaryngology, Head and Neck Surgery and Cochlear Implant Centre, University of Cologne, Germany
11
Abstract
OBJECTIVES Cochlear-implant (CI) users with single-sided deafness (SSD), that is, one normal-hearing (NH) ear and one CI ear, can obtain some unmasking benefit when a mixture of target and masking voices is presented to the NH ear and a copy of just the masking voices is presented to the CI ear. NH listeners show similar benefits in a simulation of SSD-CI listening, whereby a mixture of target and masking voices is presented to one ear and a vocoded copy of the masking voices to the opposite ear. However, the magnitude of the benefit for SSD-CI listeners is highly variable across individuals and is on average smaller than for NH listeners presented with vocoded stimuli. One possible explanation for the limited benefit observed for some SSD-CI users is that temporal and spectral discrepancies between the acoustic and electric ears interfere with contralateral unmasking. The present study presented vocoder simulations to NH participants to examine the effects of interaural temporal and spectral mismatches on contralateral unmasking. DESIGN Speech-reception performance was measured in a competing-talker paradigm for NH listeners presented with vocoder simulations of SSD-CI listening. In the monaural condition, listeners identified target speech masked by two same-gender interferers presented to the left ear. In the bilateral condition, the same stimuli were presented to the left ear, while the right ear received a noise-vocoded copy of the interfering voices. This paradigm tested whether listeners could integrate the interfering voices across the ears to better hear the monaural target. Three distortions inherent in CI processing were introduced to the vocoder processing: spectral shifts, temporal delays, and reduced frequency selectivity.
RESULTS In experiment 1, contralateral unmasking (i.e., the benefit of adding the vocoded maskers to the second ear) was impaired by spectral mismatches of four equivalent rectangular bandwidths (ERBs) or greater. This is equivalent to roughly a 3.6-mm mismatch between the cochlear places stimulated in the electric and acoustic ears, which is at the low end of the average mismatch expected for SSD-CI listeners. In experiment 2, performance was negatively affected by a temporal mismatch of 24 ms or greater, but not by mismatches in the 0 to 12 ms range expected for SSD-CI listeners. Experiment 3 showed an interaction between spectral shift and spectral resolution, with interaural spectral mismatches having less effect when the number of vocoder channels was reduced. Experiment 4 applied interaural spectral and temporal mismatches in combination. Performance was best when both frequency and timing were aligned, but when a mismatch was present in one dimension (either frequency or latency), adding a mismatch in the second dimension did not further disrupt performance. CONCLUSIONS These results emphasize the need for interaural alignment, in timing and especially in frequency, to maximize contralateral unmasking for NH listeners presented with vocoder simulations of SSD-CI listening. Improved processing strategies that reduce mismatch between the electric and acoustic ears of SSD-CI listeners might improve their ability to obtain binaural benefits in multitalker environments.
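The ERB-to-millimetre conversion quoted above can be reproduced with the commonly used approximation of about 0.89 mm of basilar membrane per ERB (the constant is an assumption of this sketch, not a value taken from the paper):

```python
MM_PER_ERB = 0.89  # common approximation for the human cochlea:
                   # ~35 mm of basilar membrane spans roughly 39 ERBs

def erb_shift_to_mm(n_erb):
    """Approximate cochlear-place shift (mm) corresponding to a
    spectral shift given in equivalent rectangular bandwidths."""
    return n_erb * MM_PER_ERB

# The 4-ERB shift that impaired unmasking in experiment 1:
shift_mm = erb_shift_to_mm(4)  # ~3.6 mm, matching the figure quoted above
```

The near-constant mm-per-ERB spacing is what makes the ERB scale convenient for expressing cochlear place mismatches.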
12
Shen J, Wright R, Souza PE. On Older Listeners' Ability to Perceive Dynamic Pitch. J Speech Lang Hear Res 2016; 59:572-582. [PMID: 27177161] [PMCID: PMC4972016] [DOI: 10.1044/2015_jslhr-h-15-0228]
Abstract
PURPOSE Natural speech comes with variation in pitch, which serves as an important cue for speech recognition. The present study investigated older listeners' dynamic pitch perception with a focus on interindividual variability. In particular, we asked whether some older listeners' difficulty perceiving dynamic pitch stems from higher susceptibility to interference from formant changes. METHOD A total of 22 older listeners and 21 younger controls with at least near-typical hearing were tested on dynamic pitch identification and discrimination tasks using synthetic monophthong and diphthong vowels. RESULTS The older listeners' ability to detect changes in pitch varied substantially, even when musical and linguistic experience was controlled. The influence of formant patterns on dynamic pitch perception was evident in both groups. Overall, strong (i.e., more dynamic) pitch contours were perceived better than weak (i.e., more monotonic) ones, particularly with rising pitch patterns. CONCLUSIONS The findings are in accordance with the literature demonstrating some older individuals' difficulty perceiving dynamic pitch cues in speech. Moreover, they suggest that this problem may be most prominent when the dynamic pitch is carried by natural speech and when the pitch contour is not strong.
Affiliation(s)
- Jing Shen
- Northwestern University, Evanston, IL
13
The Intelligibility of Interrupted Speech: Cochlear Implant Users and Normal Hearing Listeners. J Assoc Res Otolaryngol 2016; 17:475-491. [PMID: 27090115] [PMCID: PMC5023536] [DOI: 10.1007/s10162-016-0565-9]
Abstract
Compared with normal-hearing listeners, cochlear implant (CI) users show a loss of intelligibility for speech interrupted by silence or noise, possibly due to a reduced ability to integrate and restore speech glimpses across silence or noise intervals. The present study was conducted to establish the extent of the deficit typical CI users have in understanding interrupted high-context sentences as a function of interruption rate (1.5 to 24 Hz) and duty cycle (50 and 75%). Further factors, such as reduced signal quality of CI transmission, advanced age, and potentially lower baseline speech intelligibility of CI users even without interruption, were explored by presenting young as well as age-matched normal-hearing (NH) listeners with full-spectrum and vocoded speech (eight-channel, with baseline speech intelligibility performance matched). While the actual CI users had more difficulty understanding interrupted speech, and benefited less from faster interruption rates and increased duty cycle, than the eight-channel noise-band vocoded listeners, their performance was similar to that of the performance-matched noise-band vocoded listeners. These results suggest that while loss of spectro-temporal resolution indeed plays an important role in the reduced intelligibility of interrupted speech, this factor alone cannot entirely explain the deficit. Other factors associated with real CIs, such as aging or failure to transmit essential speech cues, seem to contribute additionally to poor intelligibility of interrupted speech.
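The eight-channel noise-band vocoding used to simulate CI listening in studies like this one can be sketched as follows. The FFT-based brick-wall filters, band edges, and other parameter choices here are simplifying assumptions for illustration, not the study's actual processing (real CI simulations use smoother filter banks and low-pass-filtered envelopes):

```python
import numpy as np

def _analytic(x):
    """Analytic signal via the FFT (equivalent to a Hilbert transform)."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(X * h)

def noiseband_vocoder(x, fs, n_channels=8, f_lo=100.0, f_hi=7000.0, seed=0):
    """Minimal noise-band vocoder: split the input into log-spaced
    bands, take each band's Hilbert envelope, and impose it on
    band-limited noise carriers; sum the channels."""
    rng = np.random.default_rng(seed)
    n = len(x)
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    X = np.fft.rfft(x)
    N = np.fft.rfft(rng.standard_normal(n))
    out = np.zeros(n)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(np.where(mask, X, 0.0), n)   # analysis band
        env = np.abs(_analytic(band))                    # band envelope
        carrier = np.fft.irfft(np.where(mask, N, 0.0), n)  # noise carrier
        out += env * carrier
    return out

# Vocode half a second of a 500 Hz tone.
fs = 16000
t = np.arange(fs // 2) / fs
tone = np.sin(2 * np.pi * 500 * t)
voc = noiseband_vocoder(tone, fs)
```

Because only per-channel envelopes survive, the output preserves coarse spectro-temporal structure while discarding fine structure, which is the degradation these NH-listener simulations are meant to mimic.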
14
Zirn S, Polterauer D, Keller S, Hemmert W. The effect of fluctuating maskers on speech understanding of high-performing cochlear implant users. Int J Audiol 2016; 55:295-304. [PMID: 26865377] [DOI: 10.3109/14992027.2015.1128124]
Abstract
OBJECTIVE The present study evaluated whether the poorer baseline performance of cochlear implant (CI) users or the technical and/or physiological properties of CI stimulation are responsible for the absence of masking release. DESIGN Speech reception thresholds (SRTs) were measured in continuous and modulated noise as a function of signal-to-noise ratio (SNR). STUDY SAMPLE A total of 24 subjects participated: 12 normal-hearing (NH) listeners and 12 users of recent MED-EL CI systems. RESULTS The mean SRT of CI users in continuous noise was -3.0 ± 1.5 dB SNR (mean ± SEM), while the NH group reached -5.9 ± 0.8 dB SNR. In modulated noise, the difference between groups increased considerably: the mean SRT of CI users worsened to -1.4 ± 2.3 dB SNR, while that of NH listeners improved to -18.9 ± 3.8 dB SNR. CONCLUSIONS The detrimental effect of fluctuating maskers on SRTs in CI users shown by prior studies was confirmed. The absence of masking release is thus mainly caused by the technical and/or physiological properties of CI stimulation, not merely the poorer baseline performance of many CI users relative to NH subjects. Speech understanding in modulated noise was more robust in CI users with a relatively large electrical dynamic range.
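Masking release is simply the SRT improvement when steady noise is replaced by a fluctuating masker; applying that definition to the group means reported above makes the contrast explicit:

```python
def masking_release(srt_steady_db, srt_modulated_db):
    """Masking release in dB: positive values mean the listener
    benefits from dips in the fluctuating masker (a lower, i.e.
    better, SRT in modulated than in steady noise)."""
    return srt_steady_db - srt_modulated_db

# Group means from the study above:
nh = masking_release(-5.9, -18.9)  # NH listeners glimpse in the dips
ci = masking_release(-3.0, -1.4)   # CI users: negative release
```

The NH group shows a release of about +13 dB, while the CI group shows about -1.6 dB, i.e. modulated noise actually hurt them.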
Affiliation(s)
- Stefan Zirn
- University of Applied Sciences, Offenburg, Germany; Department of Otorhinolaryngology, Medical Center, University of Freiburg, Germany; Department of Otolaryngology (ENT)/Head & Neck Surgery, University Medical Center, Ludwig Maximilians University Munich, München, Germany; and Bio-Inspired Information Processing, Department of Electrical and Computer Engineering and Institute of Medical Engineering, Technical University of Munich, Garching, Germany
- Daniel Polterauer
- Department of Otolaryngology (ENT)/Head & Neck Surgery, University Medical Center, Ludwig Maximilians University Munich, München, Germany
- Stefanie Keller
- Bio-Inspired Information Processing, Department of Electrical and Computer Engineering and Institute of Medical Engineering, Technical University of Munich, Garching, Germany
- Werner Hemmert
- Bio-Inspired Information Processing, Department of Electrical and Computer Engineering and Institute of Medical Engineering, Technical University of Munich, Garching, Germany
| |
Collapse
|
15
|
Zoefel B, VanRullen R. The Role of High-Level Processes for Oscillatory Phase Entrainment to Speech Sound. Front Hum Neurosci 2015; 9:651. [PMID: 26696863 PMCID: PMC4667100 DOI: 10.3389/fnhum.2015.00651] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2015] [Accepted: 11/16/2015] [Indexed: 11/13/2022] Open
Abstract
Constantly bombarded with input, the brain needs to extract relevant information while ignoring the irrelevant rest. Neural oscillations may provide a powerful tool for this: their high-excitability phase entrains to important input while their low-excitability phase attenuates irrelevant information. Indeed, the alignment between brain oscillations and speech improves intelligibility and helps dissociate speakers during a “cocktail party”. Although well investigated, the contribution of low- and high-level processes to phase entrainment to speech sound has only recently begun to be understood. Here, we review those findings and concentrate on three main results: (1) Phase entrainment to speech sound is modulated by attention or predictions, likely supported by top-down signals and indicating higher-level processes involved in the brain’s adjustment to speech. (2) As phase entrainment to speech can be observed without systematic fluctuations in sound amplitude or spectral content, it does not merely reflect a passive steady-state “ringing” of the cochlea but entails a higher-level process. (3) The role of intelligibility for phase entrainment is debated. Recent results suggest that intelligibility modulates the behavioral consequences of entrainment rather than directly affecting the strength of entrainment in auditory regions. We conclude that phase entrainment to speech reflects a sophisticated mechanism: several high-level processes interact to optimally align neural oscillations with predicted events of high relevance, even when these are hidden in a continuous stream of background noise.
Collapse
Affiliation(s)
- Benedikt Zoefel
- Université Paul Sabatier, Toulouse, France; Centre de Recherche Cerveau et Cognition (CerCo), CNRS, UMR5549, Pavillon Baudot CHU Purpan, Toulouse, France
| | - Rufin VanRullen
- Université Paul Sabatier, Toulouse, France; Centre de Recherche Cerveau et Cognition (CerCo), CNRS, UMR5549, Pavillon Baudot CHU Purpan, Toulouse, France
| |
Collapse
|
16
|
Zhu S, Wong LLN, Chen F, Chen Y. Consonant discrimination by Mandarin-speaking children with prelingual hearing impairment. Int J Pediatr Otorhinolaryngol 2015; 79:1354-61. [PMID: 26112665 DOI: 10.1016/j.ijporl.2015.06.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/10/2015] [Revised: 05/16/2015] [Accepted: 06/05/2015] [Indexed: 10/23/2022]
Abstract
OBJECTIVES Little is known about the consonant discrimination ability of Mandarin-speaking children with prelingual hearing impairment (HI) who are fitted with hearing aids (HAs). The present study aimed to evaluate Mandarin consonant discrimination ability in children with HI, and to explore the effects of unaided and aided hearing thresholds, the age of first HA fitting, and the duration of HA use on consonant discrimination ability. METHODS Subjects were Mandarin-speaking children aged 5;4-12;6 years with profound HI (n=41), children aged 6;1-12;4 years with severe HI (n=26), and children aged 5;0-11;9 years with moderate HI (n=9). The Mandarin Consonant Discrimination Test was administered in six test conditions: -10, -5, 0, 5, and 10 dB signal to noise ratios (S/Ns) and quiet. HAs were in the usual user's settings, adjusted to match the manufacturer-prescribed settings and individual preferences, and the volume was set to a comfortable listening level. RESULTS The results revealed that /p(h)/-/t(h)/, /ts/-/tʂ/ and /ʐ/-/l/ were the most difficult and /p/-/p(h)/, /t/-/t(h)/, /tɕ/-/tɕ(h)/ and /k/-/k(h)/ were the easiest consonant minimal pairs to discriminate in quiet, both for children with profound HI and for those with moderate to severe HI. In noise, no significant difference in performance was found among the consonant minimal pairs. A backward-elimination stepwise multiple linear regression revealed that unaided hearing level accounted for 25.4% of the variance in consonant discrimination performance in noise at 10 dB S/N and 30.4% in quiet. However, aided hearing threshold, the age of first HA fitting, and the duration of HA use did not significantly predict consonant discrimination ability in either quiet or noise. CONCLUSIONS Consonant discrimination performance of children with profound HI was poorer than that of children with moderate to severe HI. The ability to discriminate consonant pairs seems to depend on the age of acquisition of the consonants. Although the age of first HA fitting and the duration of HA use were not correlated with consonant discrimination outcomes, this finding does not diminish the importance of early HA fitting.
Collapse
Affiliation(s)
- Shufeng Zhu
- Department of Electrical and Electronic Engineering, South University of Science and Technology of China, Shenzhen, China; Division of Speech and Hearing Sciences, The University of Hong Kong, Hong Kong.
| | - Lena L N Wong
- Division of Speech and Hearing Sciences, The University of Hong Kong, Hong Kong
| | - Fei Chen
- Department of Electrical and Electronic Engineering, South University of Science and Technology of China, Shenzhen, China; Division of Speech and Hearing Sciences, The University of Hong Kong, Hong Kong
| | - Yuan Chen
- Division of Speech and Hearing Sciences, The University of Hong Kong, Hong Kong
| |
Collapse
|
17
|
Kalathottukaren RT, Purdy SC, Ballard E. Prosody perception and musical pitch discrimination in adults using cochlear implants. Int J Audiol 2015; 54:444-52. [DOI: 10.3109/14992027.2014.997314] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
|
18
|
Tone and sentence perception in young Mandarin-speaking children with cochlear implants. Int J Pediatr Otorhinolaryngol 2014; 78:1923-30. [PMID: 25213422 DOI: 10.1016/j.ijporl.2014.08.025] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/29/2014] [Revised: 08/16/2014] [Accepted: 08/18/2014] [Indexed: 11/24/2022]
Abstract
OBJECTIVES The purpose of this study was to examine the outcomes of cochlear implantation in young children in terms of (1) perception of lexical tones in quiet, (2) perception of sentences in quiet and in noise, (3) the effects of five demographic variables (i.e., preoperative hearing level, age at implantation, duration of cochlear implant use, maternal educational level, and whether a child underwent a hearing aid trial before implantation) on lexical tone perception and sentence perception, and (4) the relationship between lexical tone perception and sentence perception. METHODS 96 participants, aged 2.41 to 7.09 years, were recruited in mainland China. The children exhibited normal cognitive abilities and received unilateral implants at an average age of 2.72 years (range, 0.69-5 years). RESULTS The mean score for tone identification was 77% (SD=13%; chance level=50%). Tone 2/tone 3 was the most difficult tone contrast to identify. Children with a longer duration of CI use and whose mothers had more years of education tended to perform better in sentence perception in quiet and in noise. Having undergone a hearing aid trial before implantation and having more residual hearing were additional factors contributing to better sentence perception in noise. The only demographic variable related to tone perception in quiet was duration of CI use. In addition, while there was a modest correlation between tone perception and sentence perception in quiet (rs=0.47, p<0.001), the correlation between tone perception in quiet and sentence perception in noise was much weaker (rs=-0.28, p<0.05). CONCLUSIONS The findings suggested that most young children who had been implanted before 5 years of age and had 1-3 years of implant use did not catch up with their age-matched peers with normal hearing in tone perception and sentence perception. The weak to moderate correlation between tone perception in quiet and sentence perception might imply that improvement in tone perception in quiet does not necessarily contribute to sentence perception, especially in noise.
Collapse
|
19
|
Ardoint M, Green T, Rosen S. The intelligibility of interrupted speech depends upon its uninterrupted intelligibility. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 136:EL275-EL280. [PMID: 25324110 DOI: 10.1121/1.4895096] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
Recognition of sentences containing periodic, 5-Hz, silent interruptions of differing duty cycles was assessed for three types of processed speech. Processing conditions employed different combinations of spectral resolution and the availability of fundamental frequency (F0) information, chosen to yield similar, below-ceiling performance for uninterrupted speech. Performance declined with decreasing duty cycle similarly for each processing condition, suggesting that, at least for certain forms of speech processing and interruption rates, performance with interrupted speech may reflect that obtained with uninterrupted speech. This highlights the difficulty in interpreting differences in interrupted speech performance across conditions for which uninterrupted performance is at ceiling.
Collapse
Affiliation(s)
- Marine Ardoint
- Speech Hearing and Phonetic Sciences, University College London, Chandler House, 2 Wakefield Street, London WC1N 1PF, United Kingdom
| | - Tim Green
- Speech Hearing and Phonetic Sciences, University College London, Chandler House, 2 Wakefield Street, London WC1N 1PF, United Kingdom
| | - Stuart Rosen
- Speech Hearing and Phonetic Sciences, University College London, Chandler House, 2 Wakefield Street, London WC1N 1PF, United Kingdom
| |
Collapse
|
20
|
Luo X, Chang YP, Lin CY, Chang RY. Contribution of bimodal hearing to lexical tone normalization in Mandarin-speaking cochlear implant users. Hear Res 2014; 312:1-8. [PMID: 24576834 DOI: 10.1016/j.heares.2014.02.005] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/23/2013] [Revised: 02/09/2014] [Accepted: 02/12/2014] [Indexed: 11/19/2022]
Abstract
Native Mandarin normal-hearing (NH) listeners can easily perceive lexical tones even under conditions of great voice pitch variation across speakers by using the pitch contrast between context and target stimuli. It is unclear, however, whether cochlear implant (CI) users with limited access to pitch cues can make similar use of context pitch cues for tone normalization. In this study, native Mandarin NH listeners and pre-lingually deafened, unilaterally implanted CI users were asked to recognize a series of Mandarin tones varying from Tone 1 (high-flat) to Tone 2 (mid-rising), with or without a preceding sentence context. Most of the CI subjects used a hearing aid (HA) in the non-implanted ear (i.e., bimodal users) and were tested both with CI alone and with CI + HA. In the test without context, typical S-shaped tone recognition functions were observed for most CI subjects, and the function slopes and perceptual boundaries were similar with either CI alone or CI + HA. Compared to NH subjects, CI subjects were less sensitive to the pitch changes in target tones. In the test with context, NH subjects had more (resp. fewer) Tone-2 responses in a context with high (resp. low) fundamental frequencies, known as the contrastive context effect. For CI subjects, a similar contrastive context effect was statistically significant for tone recognition with CI + HA but not with CI alone. The results suggest that the pitch cues from CIs may not be sufficient to consistently support the pitch contrast processing for tone normalization. The additional pitch cues from aided residual acoustic hearing can, however, provide CI users with a tone normalization capability similar to that of NH listeners.
Collapse
Affiliation(s)
- Xin Luo
- Department of Speech, Language, and Hearing Sciences, Purdue University, 500 Oval Drive, West Lafayette, IN 47907, USA.
| | - Yi-Ping Chang
- Speech and Hearing Science Research Institute, Children's Hearing Foundation, Taipei, Taiwan
| | - Chun-Yi Lin
- Speech and Hearing Science Research Institute, Children's Hearing Foundation, Taipei, Taiwan
| | - Ronald Y Chang
- Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan
| |
Collapse
|
21
|
Holt CM, McDermott HJ. Discrimination of intonation contours by adolescents with cochlear implants. Int J Audiol 2013; 52:808-15. [PMID: 24053225 DOI: 10.3109/14992027.2013.832416] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
OBJECTIVE Differences in fundamental frequency (F0) contour peak alignment contribute to the perception of pitch accents in speech intonation. The present study assessed the discrimination of differences in F0 contour peak alignment by adolescent users of cochlear implants (CIs). DESIGN In Experiment 1, subjects discriminated between rise-fall F0 contours located early in the syllable and those aligned late. Recorded utterances with manipulated F0 were used as stimuli and all subjects wore a unilateral CI. In Experiment 2, bilaterally-implanted subjects repeated Experiment 1 in the bilateral condition. STUDY SAMPLE Twenty-one CI users aged 12-21 years participated. A normally-hearing control group (n = 20) also completed Experiment 1. RESULTS Listeners with normal hearing (NH) could discriminate between F0 peaks differing by 80 ms or more. Results varied among the CI users, with only four users displaying a pattern of results similar to that of the NH listeners. Sixteen CI users responded inconsistently or at chance levels (p > 0.05; binomial test). Ten CI users who were bilaterally implanted completed the tests in unilateral and bilateral listening conditions. CONCLUSIONS Results suggest that CI users may have difficulty discriminating differences in F0 peak alignment and that the use of bilateral implants did not provide a discrimination advantage.
Collapse
Affiliation(s)
- Colleen M Holt
- Audiology and Speech Pathology, The University of Melbourne, Victoria, Australia
| | | |
Collapse
|
22
|
Green T, Rosen S, Faulkner A, Paterson R. Adaptation to spectrally-rotated speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 134:1369-1377. [PMID: 23927133 DOI: 10.1121/1.4812759] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
Much recent interest surrounds listeners' abilities to adapt to various transformations that distort speech. An extreme example is spectral rotation, in which the spectrum of low-pass filtered speech is inverted around a center frequency (2 kHz here). Spectral shape and its dynamics are completely altered, rendering speech virtually unintelligible initially. However, intonation, rhythm, and contrasts in periodicity and aperiodicity are largely unaffected. Four normal hearing adults underwent 6 h of training with spectrally-rotated speech using Continuous Discourse Tracking. They and an untrained control group completed pre- and post-training speech perception tests, for which talkers differed from the training talker. Significantly improved recognition of spectrally-rotated sentences was observed for trained, but not untrained, participants. However, there were no significant improvements in the identification of medial vowels in /bVd/ syllables or intervocalic consonants. Additional tests were performed with speech materials manipulated so as to isolate the contribution of various speech features. These showed that preserving intonational contrasts did not contribute to the comprehension of spectrally-rotated speech after training, and suggested that improvements involved adaptation to altered spectral shape and dynamics, rather than just learning to focus on speech features relatively unaffected by the transformation.
Collapse
Affiliation(s)
- Tim Green
- Speech, Hearing, and Phonetic Sciences, UCL, Chandler House, 2 Wakefield Street, London WC1N 1PF, United Kingdom
| | | | | | | |
Collapse
|
23
|
Olszewski C, Gfeller K, Froman R, Stordahl J, Tomblin B. Familiar melody recognition by children and adults using cochlear implants and normal hearing children. Cochlear Implants Int 2013; 6:123-40. [DOI: 10.1179/cim.2005.6.3.123] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/31/2022]
|
24
|
Wilkinson EP, Abdel-Hamid O, Galvin JJ, Jiang H, Fu QJ. Voice conversion in cochlear implantation. Laryngoscope 2013; 123 Suppl 3:S29-43. [PMID: 23299859 DOI: 10.1002/lary.23744] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2012] [Revised: 07/01/2012] [Accepted: 08/22/2012] [Indexed: 11/09/2022]
Abstract
OBJECTIVES/HYPOTHESIS Voice conversion algorithms may benefit cochlear implant (CI) users who better understand speech produced by one talker than by another. It is unclear how the source or target talker's fundamental frequency (F0) information may contribute to perception of converted speech. This study evaluated voice conversion algorithms for CI users in which the source or target talker's F0 was included in the converted speech. STUDY DESIGN Development and evaluation of computerized voice conversion algorithms in CI patients. METHODS A series of cepstral analysis-based algorithms were developed and evaluated in six CI users. The algorithms converted talker voice gender (male-to-female, or female-to-male); either the source or target talker F0 was included in the converted speech. The voice conversion algorithms were evaluated in terms of recognition of IEEE sentences, speech quality, and voice gender discrimination. RESULTS Voice gender recognition performance showed that listeners strongly cued to the F0 that was included within the converted speech. For both IEEE sentence recognition and voice quality ratings, performance was poorer with the voice conversion algorithms than with original speech. Performance on female-to-male conversion was superior to male-to-female conversion. CONCLUSION The strong cueing to F0 within the voice conversion algorithms suggests that CI users are able to utilize temporal periodicity information for some pitch-related tasks. Limitations on spectral channel information experienced by CI users may result in poorer performance with voice conversion algorithms due to distortion of speech formant information and degradation of the spectral envelope.
Collapse
|
25
|
Cortical processing of musical sounds in children with Cochlear Implants. Clin Neurophysiol 2012; 123:1966-79. [DOI: 10.1016/j.clinph.2012.03.008] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2011] [Revised: 02/26/2012] [Accepted: 03/04/2012] [Indexed: 11/23/2022]
|
26
|
Peng SC, Chatterjee M, Lu N. Acoustic cue integration in speech intonation recognition with cochlear implants. Trends Amplif 2012; 16:67-82. [PMID: 22790392 PMCID: PMC3560417 DOI: 10.1177/1084713812451159] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
The present article reports on the perceptual weighting of prosodic cues in question-statement identification by adult cochlear implant (CI) listeners. Acoustic analyses of normal-hearing (NH) listeners' production of sentences spoken as questions or statements confirmed that in English the last bisyllabic word in a sentence carries the dominant cues (F0, duration, and intensity patterns) for the contrast. Furthermore, these analyses showed that the F0 contour is the primary cue for the question-statement contrast, with intensity and duration changes conveying important but less reliable information. On the basis of these acoustic findings, the authors examined adult CI listeners' performance in two question-statement identification tasks. In Task 1, 13 CI listeners' question-statement identification accuracy was measured using naturally uttered sentences matched for their syntactic structures. In Task 2, the same listeners' perceptual cue weighting in question-statement identification was assessed using resynthesized single-word stimuli, within which fundamental frequency (F0), intensity, and duration properties were systematically manipulated. Both tasks were also conducted with four NH listeners with full-spectrum and noise-band-vocoded stimuli. Perceptual cue weighting was assessed by comparing the estimated coefficients in logistic models fitted to the data. Of the 13 CI listeners, 7 achieved high performance levels in Task 1. The results of Task 2 indicated that multiple sources of acoustic cues for question-statement identification were utilized to different extents depending on the listening conditions (e.g., full spectrum vs. spectrally degraded) or the listeners' hearing and amplification status (e.g., CI vs. NH).
Collapse
Affiliation(s)
- Shu-Chen Peng
- Division of Ophthalmic, Neurological, and Ear, Nose and Throat Devices, Office of Device Evaluation, U.S. Food and Drug Administration, 10903 New Hampshire Ave, Silver Spring, MD 20993, USA.
| | | | | |
Collapse
|
27
|
Krull V, Luo X, Iler Kirk K. Talker-identification training using simulations of binaurally combined electric and acoustic hearing: generalization to speech and emotion recognition. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2012; 131:3069-78. [PMID: 22501080 PMCID: PMC3339506 DOI: 10.1121/1.3688533] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2023]
Abstract
Understanding speech in background noise, talker identification, and vocal emotion recognition are challenging for cochlear implant (CI) users due to poor spectral resolution and limited pitch cues with the CI. Recent studies have shown that bimodal CI users, that is, those CI users who wear a hearing aid (HA) in their non-implanted ear, receive benefit for understanding speech both in quiet and in noise. This study compared the efficacy of talker-identification training in two groups of young normal-hearing adults, listening to either acoustic simulations of unilateral CI or bimodal (CI+HA) hearing. Training resulted in improved identification of talkers for both groups, with better overall performance for simulated bimodal hearing. Generalization of learning to sentence and emotion recognition also was assessed in both subject groups. Sentence recognition in quiet and in noise improved for both groups, regardless of whether the talkers had been heard during training. Generalization to improvements in emotion recognition for two unfamiliar talkers also was noted for both groups, with the simulated bimodal-hearing group showing better overall emotion-recognition performance. Improvements in sentence recognition were retained a month after training in both groups. These results have potential implications for aural rehabilitation of conventional and bimodal CI users.
Collapse
Affiliation(s)
- Vidya Krull
- Department of Speech, Language, and Hearing Sciences, Purdue University, Heavilon Hall, 500 Oval Drive, West Lafayette, Indiana 47907, USA.
| | | | | |
Collapse
|
28
|
Carroll J, Tiaden S, Zeng FG. Fundamental frequency is critical to speech perception in noise in combined acoustic and electric hearing. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2011; 130:2054-62. [PMID: 21973360 PMCID: PMC3206909 DOI: 10.1121/1.3631563] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/28/2011] [Revised: 05/11/2011] [Accepted: 08/05/2011] [Indexed: 05/25/2023]
Abstract
Cochlear implant (CI) users have been shown to benefit from residual low-frequency hearing, specifically in pitch related tasks. It remains unclear whether this benefit is dependent on fundamental frequency (F0) or other acoustic cues. Three experiments were conducted to determine the role of F0, as well as its frequency modulated (FM) and amplitude modulated (AM) components, in speech recognition with a competing voice. In simulated CI listeners, the signal-to-noise ratio was varied to estimate the 50% correct response. Simulation results showed that the F0 cue contributes to a significant proportion of the benefit seen with combined acoustic and electric hearing, and additionally that this benefit is due to the FM rather than the AM component. In actual CI users, sentence recognition scores were collected with either the full F0 cue containing both the FM and AM components or the 500-Hz low-pass speech cue containing the F0 and additional harmonics. The F0 cue provided a benefit similar to the low-pass cue for speech in noise, but not in quiet. Poorer CI users benefited more from the F0 cue than better users. These findings suggest that F0 is critical to improving speech perception in noise in combined acoustic and electric hearing.
Collapse
Affiliation(s)
- Jeff Carroll
- Hearing and Speech Research Laboratory, Department of Biomedical Engineering, University of California, Irvine, California 92697-5320, USA
| | | | | |
Collapse
|
29
|
Effects of age on F0 discrimination and intonation perception in simulated electric and electroacoustic hearing. Ear Hear 2011; 32:75-83. [PMID: 20739892 DOI: 10.1097/aud.0b013e3181eccfe9] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Abstract
OBJECTIVES Recent research suggests that older listeners may have difficulty processing information related to the fundamental frequency (F0) of voiced speech. In this study, the focus was on the mechanisms that may underlie this reduced ability. We examined whether increased age resulted in decreased ability to perceive F0 using fine-structure cues provided by the harmonic structure of voiced speech sounds or cues provided by high-rate envelope fluctuations (periodicity). DESIGN Younger listeners with normal hearing and older listeners with normal to near-normal hearing completed two tasks of F0 perception. In the first task (steady state F0), the fundamental frequency difference limen (F0DL) was measured adaptively for synthetic vowel stimuli. In the second task (time-varying F0), listeners relied on variations in F0 to judge intonation of synthetic diphthongs. For both tasks, three processing conditions were created: eight-channel vocoding that preserved periodicity cues to F0; a simulated electroacoustic stimulation condition, which consisted of high-frequency vocoder processing combined with a low-pass-filtered portion, and offered both periodicity and fine-structure cues to F0; and an unprocessed condition. RESULTS F0 difference limens for steady state vowel sounds and the ability to discern rising and falling intonations were significantly worse in the older subjects compared with the younger subjects. For both older and younger listeners, scores were lowest for the vocoded condition, and there was no difference in scores between the unprocessed and electroacoustic simulation conditions. CONCLUSIONS Older listeners had difficulty using periodicity cues to obtain information related to talker fundamental frequency. However, performance was improved by combining periodicity cues with (low frequency) acoustic information, and that strategy should be considered in individuals who are appropriate candidates for such processing. 
For cochlear implant candidates, this effect might be achieved by partial electrode insertion providing acoustic stimulation in the low frequencies or by the combination of a traditional implant in one ear and a hearing aid in the opposite ear.
Collapse
|
30
|
Massida Z, Belin P, James C, Rouger J, Fraysse B, Barone P, Deguine O. Voice discrimination in cochlear-implanted deaf subjects. Hear Res 2010; 275:120-9. [PMID: 21167924 DOI: 10.1016/j.heares.2010.12.010] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/24/2010] [Revised: 12/08/2010] [Accepted: 12/09/2010] [Indexed: 11/26/2022]
Abstract
The human voice is important for social communication because voices carry speech as well as other information, such as a person's physical characteristics and affective state. Furthermore, restricted temporal cortical regions are specifically involved in voice processing. In cochlear-implanted deaf patients, the processor alters the spectral cues that are crucial for the perception of the paralinguistic information in human voices. The aim of this study was to assess voice discrimination abilities in cochlear-implant (CI) users and in normal-hearing subjects (NHS) using a CI simulation (vocoder). In NHS, performance in voice discrimination decreased when spectral information was reduced by decreasing the number of channels of the vocoder. In CI patients tested at different delays after implantation, we observed a strong impairment in voice discrimination at the time of activation of the neuroprosthesis. No significant improvement was detected in patients after two years of experience with the implant, even though they had reached a higher level of recovery of speech perception, suggesting a dissociation in the dynamics of functional recovery of speech and voice processing. In addition to the lack of spectral cues due to the implant processor, we hypothesize that the origin of such a deficit could derive from a crossmodal reorganization of the temporal voice areas in CI patients.
Affiliation(s)
- Z Massida, Université Toulouse, CerCo, Université Paul Sabatier, 133 route de Narbonne, 31062 Toulouse, France
31. Straatman LV, Rietveld ACM, Beijen J, Mylanus EAM, Mens LHM. Advantage of bimodal fitting in prosody perception for children using a cochlear implant and a hearing aid. J Acoust Soc Am 2010;128:1884-95. [PMID: 20968360] [DOI: 10.1121/1.3474236]
Abstract
Cochlear implants are largely unable to encode voice pitch information, which hampers the perception of some prosodic cues, such as intonation. This study investigated whether children with a cochlear implant in one ear were better able to detect differences in intonation when a hearing aid was added in the other ear ("bimodal fitting"). Fourteen children with normal hearing and 19 children with bimodal fitting participated in two experiments. The first experiment assessed the just noticeable difference in F0, by presenting listeners with a naturally produced bisyllabic utterance with an artificially manipulated pitch accent. The second experiment assessed the ability to distinguish between questions and affirmations in Dutch words, again using artificial manipulation of F0. For the implanted group, performance significantly improved in each experiment when the hearing aid was added. However, even with a hearing aid, the implanted group required exaggerated F0 excursions to perceive a pitch accent and to identify a question. These exaggerated excursions are close to the maximum excursions typically used by Dutch speakers. Nevertheless, the results of this study showed that, compared to the implant-only condition, bimodal fitting improved the perception of intonation.
Affiliation(s)
- L V Straatman, Department of Otorhinolaryngology, Head and Neck Surgery, Radboud University Nijmegen Medical Centre, P.O. Box 9101, 6500 HB Nijmegen, The Netherlands
32. Cochlear implant melody recognition as a function of melody frequency range, harmonicity, and number of electrodes. Ear Hear 2010;30:160-8. [PMID: 19194298] [DOI: 10.1097/aud.0b013e31819342b9]
Abstract
OBJECTIVE: The primary goal of the present study was to determine how cochlear implant melody recognition was affected by the frequency range of the melodies, the harmonicity of these melodies, and the number of activated electrodes. The secondary goal was to investigate whether melody recognition and speech recognition were differentially affected by the limitations imposed by cochlear implant processing.
DESIGN: Four experiments were conducted. In the first experiment, 11 cochlear implant users used their clinical processors to recognize melodies of complex harmonic tones with their fundamental frequencies in the low (104-262 Hz), middle (207-523 Hz), and high (414-1046 Hz) ranges. In the second experiment, melody recognition with pure tones was compared to melody recognition with complex harmonic tones in four subjects. In the third experiment, melody recognition was measured as a function of the number of electrodes in five subjects. In the fourth experiment, vowel and consonant recognition were measured as a function of the number of electrodes in the same five subjects who participated in the third experiment.
RESULTS: Frequency range significantly affected cochlear implant melody recognition, with higher frequency ranges producing better performance. Pure tones produced significantly better performance than complex harmonic tones. Increasing the number of activated electrodes did not affect performance with low- and middle-frequency melodies but produced better performance with high-frequency melodies. Large individual variability was observed for melody recognition, but its source seemed to be different from the source of the large variability observed in speech recognition.
CONCLUSION: Contemporary cochlear implants do not adequately encode either temporal pitch or place pitch cues. Melody recognition and speech recognition require different signal processing strategies in future cochlear implants.
33. Furness DN, Moore DR, Palmer AR, Summerfield Q. Abstracts of the British Society of Audiology Short Papers Meeting on Experimental Studies of Hearing and Deafness. Int J Audiol 2010. [DOI: 10.3109/14992027.2010.490242]
34. Trehub SE, Vongpaisal T, Nakata T. Music in the lives of deaf children with cochlear implants. Ann N Y Acad Sci 2009;1169:534-42. [PMID: 19673836] [DOI: 10.1111/j.1749-6632.2009.04554.x]
Abstract
Present-day cochlear implants provide good temporal cues and coarse spectral cues. In general, these cues are adequate for perceiving speech in quiet backgrounds and for young children's acquisition of spoken language. They are inadequate, however, for conveying the rich pitch-patterning of music. As a result, many adults who become implant users after losing their hearing find music disappointing or unacceptable. By contrast, child implant users who were born deaf or became deaf as infants or toddlers typically find music interesting and enjoyable. They recognize popular songs that they hear regularly when the test materials match critical features of the original versions. For example, they can identify familiar songs from the original recordings with words and from versions that omit the words but preserve all other cues. They also recognize theme songs from their favorite television programs when presented in original or somewhat altered form. The motivation of children with implants for listening to music or melodious speech is evident well before they understand language. Within months after receiving their implant, they prefer singing to silence. They also prefer speech in the maternal style to typical adult speech and the sounds of their native language-to-be to those of a foreign language. An important task of future research is to ascertain the relative contributions of perceptual and motivational factors to the apparent differences between child and adult implant users.
35. Souza P, Rosen S. Effects of envelope bandwidth on the intelligibility of sine- and noise-vocoded speech. J Acoust Soc Am 2009;126:792-805. [PMID: 19640044] [PMCID: PMC2730710] [DOI: 10.1121/1.3158835]
Abstract
The choice of processing parameters for vocoded signals may strongly affect which auditory features remain available. Experiment 1 varied envelope cutoff frequency (30 and 300 Hz), carrier type (sine and noise), and number of bands (2-5) for vocoded speech presented to normal-hearing listeners. For sine-vocoding, performance was better with the high cutoff; for noise-vocoding, cutoff had no effect. Consequently, noise-vocoding outperformed sine-vocoding at the low cutoff, and sine-vocoding outperformed noise-vocoding at the high cutoff. Experiment 2 measured the perceptibility of cues to voice pitch variations. A noise carrier combined with a high cutoff allowed intonation to be perceived to some degree, but performance was best in the high-cutoff sine conditions; a low cutoff led to the poorest performance regardless of carrier. Experiment 3 tested the relative contributions of co-modulation across bands and spectral density to the improved performance with a sine carrier and high cutoff. Co-modulation across bands had no effect, so it appears that the sidebands providing a denser spectrum improved performance. These results indicate that carrier type in combination with envelope cutoff can alter the cues available in vocoded speech, factors that must be considered when interpreting results obtained with vocoded signals.
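The vocoder manipulations described in this abstract follow the standard channel-vocoder recipe: band-split the speech, extract each band's envelope with a chosen cutoff, and remodulate a sine or noise carrier. The sketch below is a minimal, hypothetical illustration of that recipe, not the authors' implementation; the filter orders, band edges, and envelope smoother are all assumptions:

```python
import numpy as np
from scipy.signal import butter, sosfilt, sosfiltfilt

def vocode(x, fs, n_bands=4, f_lo=100.0, f_hi=4000.0,
           env_cutoff=300.0, carrier="noise", seed=0):
    """Channel-vocode signal x: split into log-spaced bands, extract each
    band's amplitude envelope (rectify, then low-pass at env_cutoff, the
    parameter varied in the study: 30 vs. 300 Hz), and use the envelope to
    modulate band-limited noise or a sine at the band's centre frequency."""
    rng = np.random.default_rng(seed)
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    env_sos = butter(2, env_cutoff, btype="low", fs=fs, output="sos")
    t = np.arange(len(x)) / fs
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(3, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfilt(band_sos, x)
        env = np.clip(sosfiltfilt(env_sos, np.abs(band)), 0.0, None)
        if carrier == "noise":
            c = sosfilt(band_sos, rng.standard_normal(len(x)))
        else:  # sine at the band's geometric centre
            c = np.sin(2 * np.pi * np.sqrt(lo * hi) * t)
        out += env * c
    return out
```

With a 30 Hz cutoff only the slow envelope survives; with 300 Hz, periodicity cues (and, for sine carriers, sidebands around each carrier) are retained, which is the contrast the experiments exploit.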
Affiliation(s)
- Pamela Souza, Department of Speech and Hearing Sciences, University of Washington, 1417 NE 42nd Street, Seattle, WA 98105, USA
36. Yuan M, Lee T, Yuen KCP, Soli SD, van Hasselt CA, Tong MCF. Cantonese tone recognition with enhanced temporal periodicity cues. J Acoust Soc Am 2009;126:327-37. [PMID: 19603889] [DOI: 10.1121/1.3117447]
Abstract
This study investigated the contributions of temporal periodicity cues and the effectiveness of enhancing these cues for Cantonese tone recognition in noise. A multichannel noise-excited vocoder was used to simulate speech processing in cochlear implants. Ten normal-hearing listeners were tested. Temporal envelope and periodicity cues (TEPCs) below 500 Hz were extracted from four frequency bands: 60-500, 500-1000, 1000-2000, and 2000-4000 Hz. The test stimuli were obtained by combining TEPC-modulated noise signals from individual bands. For periodicity enhancement, temporal fluctuations in the range 20-500 Hz were replaced by a sinusoid with frequency equal to the fundamental frequency of original speech. Tone identification experiments were carried out using disyllabic word carriers. Results showed that TEPCs from the two high-frequency bands were more important for tone identification than TEPCs from the low-frequency bands. The use of periodicity-enhanced TEPCs led to consistent improvement of tone identification accuracy. The improvement was more significant at low signal-to-noise ratios, and more noticeable for female than for male voices. Analysis of error distributions showed that the enhancement method reduced tone identification errors and did not show any negative effect on the recognition of segmental structures.
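The periodicity enhancement described above replaces the 20-500 Hz envelope fluctuations with a sinusoid at the original speech's fundamental frequency. A rough sketch of that idea, keeping the slow (<20 Hz) envelope contour and imposing a raised sine at a known F0; this is an illustrative reconstruction, and the exact enhancement rule, filter order, and modulation depth used in the study are not reproduced here:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def enhance_periodicity(env, fs, f0, depth=1.0):
    """Replace the 20-500 Hz fluctuations of a temporal envelope `env` with
    a sinusoid at the (assumed known) fundamental frequency f0, preserving
    the slow (<20 Hz) contour. `depth` (an assumption) sets modulation depth."""
    slow_sos = butter(2, 20.0, btype="low", fs=fs, output="sos")
    slow = np.clip(sosfiltfilt(slow_sos, env), 0.0, None)  # slow contour
    t = np.arange(len(env)) / fs
    periodic = 0.5 * (1.0 + np.sin(2 * np.pi * f0 * t))    # raised sine, 0..1
    return slow * (1.0 - depth + depth * periodic)
```

The enhanced envelope would then modulate the noise carrier of each band in place of the original envelope, giving the listener a cleaner temporal pitch cue in noise.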
Affiliation(s)
- Meng Yuan, Department of Electronic Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
37. Peng SC, Lu N, Chatterjee M. Effects of cooperating and conflicting cues on speech intonation recognition by cochlear implant users and normal hearing listeners. Audiol Neurootol 2009;14:327-37. [PMID: 19372651] [PMCID: PMC2715009] [DOI: 10.1159/000212112]
Abstract
Cochlear implant (CI) recipients have only limited access to fundamental frequency (F0) information and thus exhibit deficits in speech intonation recognition. F0 serves as the primary cue to speech intonation, but other acoustic cues (e.g., intensity properties) may also contribute. This study examined the effects of cooperating or conflicting acoustic cues on speech intonation recognition by adult CI and normal-hearing (NH) listeners, using full-spectrum and spectrally degraded speech stimuli. Identification of speech intonation signifying question versus statement contrasts was measured in 13 CI recipients and 4 NH listeners, using resynthesized bisyllabic words in which F0 and intensity properties were systematically manipulated. The stimulus set comprised tokens whose acoustic cues (i.e., F0 contour and intensity pattern) were either cooperating or conflicting. Subjects identified whether each stimulus was a 'statement' or a 'question' in a single-interval, two-alternative forced-choice (2AFC) paradigm. Logistic models were fitted to the data, and the estimated coefficients were compared between cooperating and conflicting conditions, between subject groups (CI vs. NH), and between full-spectrum and spectrally degraded conditions for NH listeners. The results indicated that CI listeners' intonation recognition was enhanced when the F0 contour and intensity cues cooperated but was adversely affected when they conflicted. With full-spectrum stimuli, by contrast, NH listeners' intonation recognition was unaffected by whether the cues cooperated or conflicted. The effects of cooperating versus conflicting cues were comparable between the CI group and NH listeners presented with spectrally degraded stimuli. These findings underscore the importance of taking multiple acoustic cues into consideration in aural rehabilitation for CI recipients.
Affiliation(s)
- Shu-Chen Peng, Center for Devices and Radiological Health, US Food and Drug Administration, Rockville, MD, USA
38. Lin YS, Peng SC. Effects of frequency allocation on lexical tone identification by Mandarin-speaking children with a cochlear implant. Acta Otolaryngol 2009;129:289-96. [PMID: 19132634] [DOI: 10.1080/00016480701596047]
Abstract
CONCLUSION: Frequency allocation with extended frequency ranges yielded significantly higher accuracy in pediatric CI recipients' lexical tone identification. These findings suggest that frequency allocation with extended frequency ranges may be useful in improving lexical tone recognition for at least some pediatric CI recipients.
OBJECTIVES: To assess the effects of frequency allocation on lexical tone identification by Mandarin-speaking children with a cochlear implant (CI).
SUBJECTS AND METHODS: In a prospective study, 15 prelingually deafened children between 7.17 and 16.17 years of age served as participants. Using Med-El CI devices, each participant's accuracy in lexical tone identification was compared in two conditions: the experimental condition, using an extended frequency range from 233 to 8501 Hz, and the control condition, using the participant's clinically assigned frequency range from 300 to 8404 Hz.
RESULTS: The group mean accuracy in lexical tone identification was 88.02% (SD = 6.31%) in the experimental condition and 83.82% (SD = 9.84%) in the control condition. The mean was 4.20% (SD = 5.48%) higher in the experimental condition; this difference was statistically significant (t(14) = 2.97, p = 0.010).
39.
Abstract
OBJECTIVES: This study investigated the identification of familiar environmental sounds with varying spectral resolution to establish (1) the number of frequency channels needed to perceive a large heterogeneous set of familiar environmental sounds, (2) the role of cross-channel asynchrony in identification performance, and (3) the acoustic correlates of the spectral resolution required for identification.
DESIGN: In experiment 1, 60 normal-hearing listeners identified environmental sounds in a 60-alternative closed-set response task as a function of six spectral resolution conditions (i.e., 2, 4, 8, 16, 24, and 32 frequency channels) obtained with an envelope vocoder. In experiment 2, identification accuracy for varying amounts of cross-channel asynchrony was determined for sounds with preserved and degraded fine spectral structure in 10 normal-hearing listeners. Experiment 3 examined identification performance of 72 listeners across six spectral resolution conditions as in experiment 1, but using three different signal processing methods designed to minimize asynchrony across channels. Follow-up acoustic and discriminant analyses were carried out to identify parameters that can distinguish environmental sounds based on the required spectral resolution.
RESULTS: Identification accuracy tended to improve with increasing spectral resolution, reaching a maximum of 76%. However, in experiment 1, performance did not change significantly beyond eight channels, whereas identification accuracy of some sounds declined with increasing spectral resolution. In experiment 2, increases in cross-channel asynchrony for sounds with preserved fine spectra had a small but significant negative effect on identification. However, minimizing the amount of asynchrony had no significant effect on the overall identification of spectrally degraded sounds in experiment 3. Acoustic analysis indicated several spectral and temporal measures that differed significantly between sounds that required eight or fewer channels and those that required 16 or more channels for 70% correct identification. Discriminant analysis revealed that the sounds could be classified into high- and low-required-spectral-resolution groups with 83% accuracy based on only two acoustic parameters: the number of bursts in the envelope and the standard deviation of the spectral centroid velocity.
CONCLUSIONS: Increasing spectral resolution generally had a positive effect on the identification of familiar environmental sounds. However, across conditions, performance accuracy remained well below that of control stimuli with preserved fine spectra, despite becoming asymptotic above eight channels. Cross-channel asynchrony introduced during vocoder processing, although detrimental for some sounds, was not a major factor preventing further improvement in overall accuracy. A spectral resolution greater than 32 channels, along with additional fine spectral and temporal information, may be required for identification of a number of environmental sounds. This study provides a preliminary basis for optimizing environmental sound perception by cochlear implant users by highlighting the role of several acoustic factors important for environmental sound identification.
40.
Abstract
OBJECTIVES: Tone production is particularly important for communication in tone languages such as Mandarin Chinese. In the present study, an artificial neural network was used to recognize tones produced by adult native speakers. The purposes of the study were (1) to test the sensitivity of the neural network to the speaker variation typical of adult speaker groups, (2) to evaluate two normalization procedures for overcoming the effects of speaker variation, and (3) to compare the tone recognition performance of the neural network with that of human listeners.
DESIGN: A feedforward multilayer neural network was used. Twenty-nine adult native Mandarin Chinese speakers were recruited to record tone samples. The F0 contours of the vowel portion of the 1044 recorded monosyllabic words were extracted using an autocorrelation method, and samples from these F0 contours served as inputs to the neural network. The efficacy of the network was first tested by varying the number of inputs and the number of neurons in the hidden layer from 1 to 16. The network's sensitivity to speaker variation was tested by (1) using raw F0 data from speech tokens of 1 to 29 randomly drawn speakers, (2) using raw F0 data from male-only or female-only speakers, and (3) using two sets of normalized F0 data (tone 1-based normalization and first-order derivative) from 1 to 29 randomly drawn speakers. The recognition performance of the network under several experimental conditions was compared with that of 10 normal-hearing, native Mandarin Chinese speaking adult listeners.
RESULTS: Three inputs and four hidden neurons were sufficient for the network to perform at about 85% correct using speech samples without normalization. The network's performance was affected by variation across speakers, particularly between genders. With the tone 1-based normalization procedure, performance improved significantly, and the recognition accuracy of the network, overall and for each tone, was comparable with that of the human listeners.
CONCLUSIONS: The neural network can be used to evaluate the tone production of Mandarin Chinese speaking adults with human listener-like recognition accuracy, and the tone 1-based normalization procedure improves its performance to that level. The success of the network in recognizing tones from multiple speakers supports its utility for evaluating tone production. Further testing with hearing-impaired speakers might reveal its potential for clinical evaluation of tone production.
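The pipeline described in this abstract (sampled F0 contours, speaker normalization, small feedforward classifier) can be sketched in miniature. The code below uses synthetic, hypothetical F0 contours rather than the study's recordings, applies the first-order-derivative normalization (successive log-F0 differences, which removes each speaker's register), and trains a tiny tanh/softmax network; the contour shapes, noise levels, and training settings are all assumptions:

```python
import numpy as np

def make_contours(n_per_tone=50, seed=1):
    """Synthetic (hypothetical) 3-sample F0 contours for the four Mandarin
    tones, with a random per-token base F0 to mimic speaker variation."""
    rng = np.random.default_rng(seed)
    t = np.array([0.0, 0.5, 1.0])
    shapes = [lambda t: np.ones_like(t),              # tone 1: high level
              lambda t: 1.0 + 0.3 * t,                # tone 2: rising
              lambda t: 1.0 - 0.3 * 4 * t * (1 - t),  # tone 3: dipping
              lambda t: 1.0 - 0.4 * t]                # tone 4: falling
    X, y = [], []
    for label, shape in enumerate(shapes):
        for _ in range(n_per_tone):
            base = rng.uniform(100.0, 300.0)          # male-to-female range
            X.append(base * shape(t) * rng.normal(1.0, 0.01, t.size))
            y.append(label)
    return np.array(X), np.array(y)

def derivative_normalize(X):
    """First-order-derivative normalization: successive log-F0 differences."""
    return np.diff(np.log(X), axis=1)

def train_mlp(X, y, hidden=4, lr=1.0, epochs=3000, seed=2):
    """Tiny feedforward net (one tanh hidden layer, softmax output) trained
    by full-batch gradient descent; returns a predict function."""
    rng = np.random.default_rng(seed)
    n_out = y.max() + 1
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, n_out));      b2 = np.zeros(n_out)
    Y = np.eye(n_out)[y]
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                      # hidden activations
        Z = H @ W2 + b2
        P = np.exp(Z - Z.max(axis=1, keepdims=True))  # stable softmax
        P /= P.sum(axis=1, keepdims=True)
        dZ = (P - Y) / len(X)                         # cross-entropy grad
        dH = dZ @ W2.T * (1 - H ** 2)
        W2 -= lr * (H.T @ dZ); b2 -= lr * dZ.sum(0)
        W1 -= lr * (X.T @ dH); b1 -= lr * dH.sum(0)
    def predict(Xn):
        return np.argmax(np.tanh(Xn @ W1 + b1) @ W2 + b2, axis=1)
    return predict
```

On this idealized data the derivative features separate the four tone shapes cleanly, which mirrors the abstract's finding that normalization removes the cross-speaker (and cross-gender) variation that hurts recognition of raw F0.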
Affiliation(s)
- Ning Zhou, School of Hearing, Speech and Language Sciences, Ohio University, Athens, Ohio
- Wenle Zhang, School of Electrical Engineering and Computer Science, Ohio University, Athens, Ohio
- Chao-Yang Lee, School of Hearing, Speech and Language Sciences, Ohio University, Athens, Ohio
- Li Xu, School of Hearing, Speech and Language Sciences, Ohio University, Athens, Ohio
41. Production and Perception of Speech Intonation in Pediatric Cochlear Implant Recipients and Individuals with Normal Hearing. Ear Hear 2008;29:336-51. [DOI: 10.1097/aud.0b013e318168d94d]
42. Kuo YC, Rosen S, Faulkner A. Acoustic cues to tonal contrasts in Mandarin: implications for cochlear implants. J Acoust Soc Am 2008;123:2815. [PMID: 18529197] [DOI: 10.1121/1.2896755]
Abstract
The present study systematically manipulated three acoustic cues--fundamental frequency (f0), amplitude envelope, and duration--to investigate their contributions to tonal contrasts in Mandarin. Simplified stimuli with all possible combinations of these three cues were presented for identification to eight normal-hearing listeners, all native speakers of Mandarin from Taiwan. The f0 information was conveyed either by an f0-controlled sawtooth carrier or a modulated noise so as to compare the performance achievable by a clear indication of voice f0 and what is possible with purely temporal coding of f0. Tone recognition performance with explicit f0 was much better than that with any combination of other acoustic cues (consistently greater than 90% correct compared to 33%-65%; chance is 25%). In the absence of explicit f0, the temporal coding of f0 and amplitude envelope both contributed somewhat to tone recognition, while duration had only a marginal effect. Performance based on these secondary cues varied greatly across listeners. These results explain the relatively poor perception of tone in cochlear implant users, given that cochlear implants currently provide only weak cues to f0, so that users must rely upon the purely temporal (and secondary) features for the perception of tone.
Affiliation(s)
- Yu-Ching Kuo, Department of Special Education, Taipei Municipal University of Education, No. 1, Ai-Guo West Road, Taipei, 10042, Taiwan
43. Chatterjee M, Peng SC. Processing F0 with cochlear implants: Modulation frequency discrimination and speech intonation recognition. Hear Res 2007;235:143-56. [PMID: 18093766] [DOI: 10.1016/j.heares.2007.11.004]
Abstract
Fundamental frequency (F0) processing by cochlear implant (CI) listeners was measured using a psychophysical task and a speech intonation recognition task. Listeners' Weber fractions for modulation frequency discrimination were measured using an adaptive, 3-interval, forced-choice paradigm; stimuli were presented through a custom research interface. In the speech intonation recognition task, listeners were asked to indicate whether resynthesized bisyllabic words, presented in the free field through their everyday speech processors, were question-like or statement-like. The resynthesized tokens were systematically manipulated to have different initial F0s, representing male vs. female voices, and different F0 contours (i.e., falling, flat, and rising). Although the CI listeners showed considerable variation in performance on both tasks, significant correlations were observed between their sensitivity to modulation frequency in the psychophysical task and their performance in intonation recognition. Consistent with a greater reliance on temporal cues, the CI listeners' performance in the intonation recognition task was significantly poorer with the higher initial-F0 stimuli than with the lower initial-F0 stimuli. Similar results were obtained with normal-hearing listeners attending to noiseband-vocoded CI simulations with reduced spectral resolution.
Affiliation(s)
- Monita Chatterjee, Department of Hearing and Speech Sciences, University of Maryland, College Park, MD 20742, USA
44. Peng SC, Tomblin JB, Spencer LJ, Hurtig RR. Imitative production of rising speech intonation in pediatric cochlear implant recipients. J Speech Lang Hear Res 2007;50:1210-27. [PMID: 17905907] [PMCID: PMC3212410] [DOI: 10.1044/1092-4388(2007/085)]
Abstract
PURPOSE: This study investigated the acoustic characteristics of pediatric cochlear implant (CI) recipients' imitative production of rising speech intonation, in relation to perceptual judgments by listeners with normal hearing (NH).
METHOD: Recordings of a yes-no interrogative utterance imitated by 24 prelingually deafened children with a CI were extracted from annual evaluation sessions. These utterances were perceptually judged by adult NH listeners with regard to intonation contour type (non-rise, partial-rise, or full-rise) and contour appropriateness (on a 5-point scale). The fundamental frequency, intensity, and duration properties of each utterance were also acoustically analyzed.
RESULTS: Adult NH listeners' judgments of intonation contour type and contour appropriateness for each CI participant's utterances were highly positively correlated. The pediatric CI recipients did not consistently use appropriate intonation contours when imitating a yes-no question. The acoustic properties of the speech intonation these individuals produced were discernible among utterances of different intonation contour types, according to NH listeners' perceptual judgments.
CONCLUSIONS: These findings delineate the perceptual and acoustic characteristics of speech intonation imitated by prelingually deafened children and young adults with a CI. Future studies should address whether the degraded signals these individuals perceive via a CI contribute to their difficulties with speech intonation production.
45. Hamilton N, Green T, Faulkner A. Use of a single channel dedicated to conveying enhanced temporal periodicity cues in cochlear implants: effects on prosodic perception and vowel identification. Int J Audiol 2007;46:244-53. [PMID: 17487672] [DOI: 10.1080/14992020601053340]
Abstract
The continuous interleaved sampling (CIS) strategy for cochlear implants has well-established limitations for the perception of pitch changes in speech. This study investigated a modification of CIS in which one channel was dedicated to the transmission of a temporal encoding of fundamental frequency (F0). Normal-hearing subjects listening to noise-excited vocoders, and implant users, were tested on labelling the pitch movement of diphthongal glides, on using intonation information to identify sentences as question or statement, and on vowel recognition. There were no significant differences between the modified processing and CIS in vowel recognition. While there was limited evidence of improved pitch perception relative to CIS when simplified F0 modulation was applied to the most basal channel, in general it appears that, for most implant users, restricting F0-related modulation to one channel does not provide significantly enhanced pitch information.
Affiliation(s)
- Nicholas Hamilton, Department of Phonetics and Linguistics, University College London, UK
46. Green T, Katiri S, Faulkner A, Rosen S. Talker intelligibility differences in cochlear implant listeners. J Acoust Soc Am 2007;121:EL223-9. [PMID: 17552573] [DOI: 10.1121/1.2720938]
Abstract
People vary in the intelligibility of their speech. This study investigated whether across-talker intelligibility differences observed in normally-hearing listeners are also found in cochlear implant (CI) users. Speech perception for male, female, and child pairs of talkers differing in intelligibility was assessed with actual and simulated CI processing and in normal hearing. While overall speech recognition was, as expected, poorer for CI users, differences in intelligibility across talkers were consistent across all listener groups. This suggests that the primary determinants of intelligibility differences are preserved in the CI-processed signal, though no single critical acoustic property could be identified.
Affiliation(s)
- Tim Green, Department of Phonetics and Linguistics, University College London, 4 Stephenson Way, London NW1 2HE, United Kingdom
47. Lin YS, Lee FP, Huang IS, Peng SC. Continuous improvement in Mandarin lexical tone perception as the number of channels increased: a simulation study of cochlear implant. Acta Otolaryngol 2007;127:505-14. [PMID: 17453477] [DOI: 10.1080/00016480600951434]
Abstract
CONCLUSION: Whereas English phoneme recognition with cochlear implants (CIs) usually does not improve beyond six or eight channels, increasing the total number of channels continuously improved perception of Mandarin tones.
OBJECTIVE: To test the hypothesis that current CI strategies might be modified to improve Mandarin lexical tone perception.
MATERIALS AND METHODS: Lexical tone perception tests using 48 Mandarin Chinese monosyllables were conducted in 32 native Mandarin speakers with normal hearing. Tone perception performance was compared across the controlled factors: total channel number, the number of channels allocated to the F0 spectrum, and whether there were spectral shifts in the electrode configuration. An experimental condition that preserves fine structure was used as a comparison.
RESULTS: The signal processing strategy using 16 channels, which is technically possible with current CI devices, produced better tone perception than those using 12 or 8 channels. Increasing the number of channels allocated to the F0 spectrum did not improve tone perception, and spectral shifts did not change it. An experimental condition (FiC12) that preserves fine structure produced significantly better overall tone perception scores than the other, envelope-based conditions.
Affiliation(s)
- Yung-Song Lin
- Department of Otolaryngology, Taipei Medical University, Chi Mei Medical Center, Tainan city, Taiwan, ROC.
48
Qin MK, Oxenham AJ. Effects of introducing unprocessed low-frequency information on the reception of envelope-vocoder processed speech. J Acoust Soc Am 2006;119:2417-26. [PMID: 16642854] [DOI: 10.1121/1.2178719]
Abstract
This study investigated the benefits of adding unprocessed low-frequency information to acoustic simulations of cochlear-implant processing in normal-hearing listeners. Implant processing was simulated using an eight-channel noise-excited envelope vocoder, and low-frequency information was added by replacing the lower frequency channels of the processor with a low-pass-filtered version of the original stimulus. Experiment 1 measured sentence-level speech reception as a function of target-to-masker ratio, with either steady-state speech-shaped noise or single-talker maskers. Experiment 2 measured listeners' ability to identify two vowels presented simultaneously, as a function of the F0 difference between the two vowels. In both experiments low-frequency information was added below either 300 or 600 Hz. The introduction of the additional low-frequency information led to substantial and significant improvements in performance in both experiments, with a greater improvement observed for the higher (600 Hz) than for the lower (300 Hz) cutoff frequency. However, performance never equaled performance in the unprocessed conditions. The results confirm other recent demonstrations that added low-frequency information can provide significant benefits in intelligibility, which may at least in part be attributed to improvements in F0 representation. The findings provide further support for efforts to make use of residual acoustic hearing in cochlear-implant users.
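The manipulation described above, noise-excited envelope vocoding with an optional unprocessed low-frequency channel mixed back in, can be sketched as follows. The band layout, envelope cutoff, and FFT-based brick-wall filters are illustrative assumptions, not the study's exact implementation:

```python
import numpy as np

def fft_bandpass(x, fs, lo, hi):
    """Crude brick-wall band-pass filter via FFT bin masking (illustration only)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    spectrum[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def envelope(x, fs, cutoff=160.0):
    """Temporal envelope: full-wave rectification plus low-pass smoothing."""
    return fft_bandpass(np.abs(x), fs, 0.0, cutoff)

def noise_vocoder(x, fs, n_channels=8, f_lo=80.0, f_hi=6000.0, lp_mix_hz=None):
    """Noise-excited envelope vocoder. If lp_mix_hz is set, the unprocessed
    signal below that frequency is added back, analogous to the study's
    low-frequency conditions (300 or 600 Hz)."""
    rng = np.random.default_rng(0)                     # fixed seed for reproducibility
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)   # log-spaced band edges
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        env = envelope(fft_bandpass(x, fs, lo, hi), fs)
        carrier = fft_bandpass(rng.standard_normal(len(x)), fs, lo, hi)
        out += env * carrier                           # envelope modulates band-limited noise
    if lp_mix_hz is not None:
        out += fft_bandpass(x, fs, 0.0, lp_mix_hz)     # unprocessed lows mixed back in
    return out
```

The key design point the study exploits is that the low-pass branch bypasses the vocoder entirely, so resolved low-order harmonics (and hence F0 cues) survive only in that branch.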
Affiliation(s)
- Michael K Qin
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.
49
Qin MK, Oxenham AJ. Effects of envelope-vocoder processing on F0 discrimination and concurrent-vowel identification. Ear Hear 2006;26:451-60. [PMID: 16230895] [DOI: 10.1097/01.aud.0000179689.79868.06]
Abstract
OBJECTIVE: To examine the effects of envelope-vocoder sound processing on listeners' ability to discriminate changes in fundamental frequency (F0) in anechoic and reverberant conditions, and on their ability to identify concurrent vowels based on differences in F0.
DESIGN: In the first experiment, F0 difference limens (F0DLs) were measured as a function of the number of envelope-vocoder frequency channels (1, 4, 8, 24, and 40 channels, and unprocessed) in four normal-hearing listeners, with degree of simulated reverberation (none, mild, and severe) as a parameter. In the second experiment, vowel identification was measured as a function of the F0 difference between two simultaneous vowels in six normal-hearing listeners, with the number of vocoder channels (8 and 24 channels, and unprocessed) as a parameter.
RESULTS: Reverberation was detrimental to F0 discrimination in conditions with fewer vocoder channels. Despite reasonable F0DLs (<1 semitone) with 24- and 8-channel vocoder processing, listeners were unable to benefit from F0 differences between the competing vowels in the concurrent-vowel paradigm.
CONCLUSIONS: The overall detrimental effects of vocoder processing are probably due to the poor spectral representation of the lower-order harmonics. The F0 information carried in the temporal envelope is weak, susceptible to reverberation, and may not suffice for source segregation. To the extent that vocoder processing simulates cochlear implant processing, users of current implant processing schemes are unlikely to benefit from F0 differences between competing talkers when listening to speech in complex environments. The results provide further incentive for finding a way to make the information from low-order, resolved harmonics available to cochlear implant users.
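A detail worth making concrete: the F0DLs are reported in semitones, where one semitone is a frequency ratio of 2^(1/12). A minimal helper for converting an F0 change into semitones (purely illustrative, not from the paper):

```python
import math

def semitones(f_ref, f_test):
    """Signed interval between two frequencies in semitones (12 per octave)."""
    return 12.0 * math.log2(f_test / f_ref)
```

For example, a 5% change in F0 is about 0.84 semitones, so a listener who can just detect it has a sub-semitone F0DL of the kind reported for the 8- and 24-channel conditions.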
Affiliation(s)
- Michael K Qin
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.
50
Poissant SF, Whitmal NA, Freyman RL. Effects of reverberation and masking on speech intelligibility in cochlear implant simulations. J Acoust Soc Am 2006;119:1606-15. [PMID: 16583905] [DOI: 10.1121/1.2168428]
Abstract
Two experiments investigated the impact of reverberation and masking on speech understanding using cochlear implant (CI) simulations. Experiment 1 tested sentence recognition in quiet. Stimuli were processed with reverberation simulation (T=0.425, 0.266, 0.152, and 0.0 s) and then either processed with vocoding (6, 12, or 24 channels) or were subjected to no further processing. Reverberation alone had only a small impact on perception when as few as 12 channels of information were available. However, when the processing was limited to 6 channels, perception was extremely vulnerable to the effects of reverberation. In experiment 2, subjects listened to reverberated sentences, through 6- and 12-channel processors, in the presence of either speech-spectrum noise (SSN) or two-talker babble (TTB) at various target-to-masker ratios. The combined impact of reverberation and masking was profound, although there was no interaction between the two effects. This differs from results obtained in subjects listening to unprocessed speech where interactions between reverberation and masking have been shown to exist. A speech transmission index (STI) analysis indicated a reasonably good prediction of speech recognition performance. Unlike previous investigations, the SSN and TTB maskers produced equivalent results, raising questions about the role of informational masking in CI processed speech.
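A reverberation simulation of the kind described can be sketched by convolving the signal with a synthetic impulse response: white noise with an exponential decay chosen so that energy falls 60 dB over the quoted reverberation time T. This is a common simplification, not necessarily the study's own method, and it does not cover the T = 0.0 s case, which is simply the unprocessed signal:

```python
import numpy as np

def exp_decay_ir(fs, t60, length_s=None, seed=0):
    """Synthetic room impulse response: white noise shaped by an
    exponential decay reaching -60 dB at t = t60 seconds."""
    if length_s is None:
        length_s = t60
    n = max(1, int(round(length_s * fs)))
    t = np.arange(n) / fs
    decay = 10.0 ** (-3.0 * t / t60)     # amplitude is -60 dB at t = t60
    rng = np.random.default_rng(seed)
    return rng.standard_normal(n) * decay

def reverberate(x, fs, t60):
    """Apply simulated reverberation by convolution, trimmed to len(x)."""
    ir = exp_decay_ir(fs, t60)
    return np.convolve(x, ir)[: len(x)]
```

In a CI-simulation pipeline like the one above, the reverberated signal would then be passed through the 6-, 12-, or 24-channel vocoder, which is why envelope smearing by the reverberant tail compounds the loss of spectral detail.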
Affiliation(s)
- Sarah F Poissant
- Communication Disorders Department University of Massachusetts Amherst, 6 Arnold House, Amherst, Massachusetts 01003, USA.