51
Place pitch versus electrode location in a realistic computational model of the implanted human cochlea. Hear Res 2014; 315:10-24. [PMID: 24975087] [DOI: 10.1016/j.heares.2014.06.003]
Abstract
Place pitch was investigated in a computational model of the implanted human cochlea containing nerve fibres with realistic trajectories that take the variable distance between the organ of Corti and spiral ganglion into account. The model was further updated from previous studies by including fluid compartments in the modiolus and updating the electrical conductivity values of (temporal) bone and the modiolus, based on clinical data. Four different cochlear geometries were used, modelled with both lateral and perimodiolar implants, and their neural excitation patterns were examined for nerve fibres modelled with and without peripheral processes. Additionally, equations were derived from the model geometries that describe Greenwood's frequency map as a function of cochlear angle at the basilar membrane as well as at the spiral ganglion. The main findings are: (I) In the first (basal) turn of the cochlea, cochlear implant-induced pitch can be predicted fairly well using the Greenwood function. (II) Beyond the first turn this pitch becomes increasingly unpredictable, greatly dependent on stimulus level, the state of the cochlear neurons and the electrode's distance from the modiolus. (III) After the first turn cochlear implant-induced pitch decreases as stimulus level increases, but the pitch does not reach values expected from direct spiral ganglion stimulation unless the peripheral processes are missing. (IV) Electrode contacts near the end of the spiral ganglion or deeper elicit very unpredictable pitch, with broad frequency ranges that strongly overlap with those of neighbouring contacts. (V) The characteristic place pitch for stimulation at either the organ of Corti or the spiral ganglion can be described as a function of cochlear angle by the equations presented in this paper.
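For orientation, Greenwood's frequency map referred to in this abstract has a widely used closed form for the human cochlea. The sketch below uses the commonly cited human parameters (35 mm cochlear length, A = 165.4, a = 2.1, k = 0.88); these are textbook values assumed here for illustration, and the paper's own angle-based equations are not reproduced.

```python
import numpy as np

def greenwood_cf(x_mm, length_mm=35.0, A=165.4, a=2.1, k=0.88):
    """Characteristic frequency (Hz) at a distance x_mm from the apex, using
    Greenwood's (1990) human frequency-position function
    F = A * (10**(a * x) - k), where x is the proportion of total cochlear length."""
    x = x_mm / length_mm
    return A * (10.0 ** (a * x) - k)

# Roughly 20 Hz at the apex rising to about 20 kHz at the base of a 35 mm cochlea
for d in (0.0, 10.0, 20.0, 30.0, 35.0):
    print(f"{d:5.1f} mm from apex -> {greenwood_cf(d):8.1f} Hz")
```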
52
Melody recognition in dichotic listening with or without frequency-place mismatch. Ear Hear 2013; 35:379-82. [PMID: 24351609] [DOI: 10.1097/aud.0000000000000013]
Abstract
OBJECTIVES The purpose of the study was to examine recognition of degraded melodic stimuli in dichotic listening with or without frequency-place mismatch. DESIGN Melodic stimuli were noise-vocoded with various numbers of channels in a dichotic and a monaural processor. In the dichotic zipper processor, the odd-indexed channels were tonotopically matched and presented to the left ear, while the even-indexed channels were tonotopically matched or upward shifted in frequency and presented to the right ear. In the monaural processor, all channels, either unshifted or shifted, were presented to the left ear alone. Familiar melody recognition was measured in 16 normal-hearing adult listeners. RESULTS Performance for dichotically presented melodic stimuli did not differ from that for monaurally presented stimuli, even with low spectral resolution (8 channels). With spectral shift introduced in one ear, melody recognition decreased with increasing spectral shift in a nonmonotonic fashion. With spectral shift, melody recognition in dichotic listening was either not different from or, in a few cases, superior to that in the monaural condition. CONCLUSIONS With no spectral shift, cohesive fusion of dichotically presented melodic stimuli did not seem to depend on spectral resolution. In spectrally shifted conditions, listeners may have suppressed the partially shifted channels in the right ear and selectively attended only to the unshifted ones, resulting in dichotic advantages for melody recognition in some cases.
53
Winn MB, Rhone AE, Chatterjee M, Idsardi WJ. The use of auditory and visual context in speech perception by listeners with normal hearing and listeners with cochlear implants. Front Psychol 2013; 4:824. [PMID: 24204359] [PMCID: PMC3817459] [DOI: 10.3389/fpsyg.2013.00824]
Abstract
There is a wide range of acoustic and visual variability across different talkers and different speaking contexts. Listeners with normal hearing (NH) accommodate that variability in ways that facilitate efficient perception, but it is not known whether listeners with cochlear implants (CIs) can do the same. In this study, listeners with NH and listeners with CIs were tested for accommodation to auditory and visual phonetic contexts created by gender-driven speech differences as well as vowel coarticulation and lip rounding in both consonants and vowels. Accommodation was measured as the shifting of perceptual boundaries between /s/ and /ʃ/ sounds in various contexts, as modeled by mixed-effects logistic regression. Owing to the spectral contrasts thought to underlie these context effects, CI listeners were predicted to perform poorly, but showed considerable success. Listeners with CIs not only showed sensitivity to auditory cues to gender, they were also able to use visual cues to gender (i.e., faces) as a supplement or proxy for information in the acoustic domain, in a pattern that was not observed for listeners with NH. Spectrally degraded stimuli heard by listeners with NH generally did not elicit strong context effects, underscoring the limitations of noise vocoders and/or the importance of experience with electric hearing. Visual cues for consonant lip rounding and vowel lip rounding were perceived in a manner consistent with coarticulation and were generally used more heavily by listeners with CIs. Results suggest that listeners with CIs are able to accommodate various sources of acoustic variability either by attending to appropriate acoustic cues or by inferring them via the visual signal.
Affiliation(s)
- Matthew B Winn
- Waisman Center & Department of Surgery, University of Wisconsin-Madison , Madison, WI, USA
54
van Besouw RM, Forrester L, Crowe ND, Rowan D. Simulating the effect of interaural mismatch in the insertion depth of bilateral cochlear implants on speech perception. J Acoust Soc Am 2013; 134:1348-1357. [PMID: 23927131] [DOI: 10.1121/1.4812272]
Abstract
A bilateral advantage for diotically presented stimuli has been observed for cochlear implant (CI) users and is suggested to be dependent on symmetrical implant performance. Studies using CI simulations have not shown a true "bilateral" advantage, but a "better ear" effect and have demonstrated that performance decreases with increasing basalward shift in insertion depth. This study aimed to determine whether there is a bilateral advantage for CI simulations with interaurally matched insertions and the extent to which performance is affected by interaural insertion depth mismatch. Speech perception in noise and self-reported ease of listening were measured using matched bilateral, mismatched bilateral and unilateral CI simulations over four insertion depths for seventeen normal hearing listeners. Speech scores and ease of listening reduced with increasing basalward shift in (interaurally matched) insertion depth. A bilateral advantage for speech perception was only observed when the insertion depths were interaurally matched and deep. No advantage was observed for small to moderate interaural insertion-depth mismatches, consistent with a better ear effect. Finally, both measures were poorer than expected for a better ear effect for large mismatches, suggesting that misalignment of the electrode arrays may prevent a bilateral advantage and detrimentally affect perception of diotically presented speech.
Affiliation(s)
- Rachel M van Besouw
- Institute of Sound and Vibration Research, University of Southampton, Southampton, Hampshire, SO17 1BJ, United Kingdom.
55
Newman R, Chatterjee M. Toddlers' recognition of noise-vocoded speech. J Acoust Soc Am 2013; 133:483-94. [PMID: 23297920] [PMCID: PMC3548833] [DOI: 10.1121/1.4770241]
Abstract
Despite the remarkable clinical success of cochlear implants, implant listeners today still receive spectrally degraded information. Much research has examined normally hearing adult listeners' ability to interpret spectrally degraded signals, primarily using noise-vocoded speech to simulate cochlear implant processing. Far less research has explored infants' and toddlers' ability to interpret spectrally degraded signals, despite the fact that children in this age range are frequently implanted. This study examines 27-month-old typically developing toddlers' recognition of noise-vocoded speech in a language-guided looking study. Children saw two images on each trial and heard a voice instructing them to look at one item ("Find the cat!"). Full-spectrum sentences or their noise-vocoded versions were presented with varying numbers of spectral channels. Toddlers showed equivalent proportions of looking to the target object with full-spectrum speech and 24- or 8-channel noise-vocoded speech; they failed to look appropriately with 2-channel noise-vocoded speech and showed variable performance with 4-channel noise-vocoded speech. Despite accurate looking performance for speech with at least eight channels, children were slower to respond appropriately as the number of channels decreased. These results indicate that 2-yr-olds have developed the ability to interpret vocoded speech, even without practice, but that doing so requires additional processing. These findings have important implications for pediatric cochlear implantation.
Affiliation(s)
- Rochelle Newman
- Department of Hearing and Speech Sciences, 0100 Lefrak Hall, University of Maryland, College Park, Maryland 20742, USA.
56
Peng SC, Chatterjee M, Lu N. Acoustic cue integration in speech intonation recognition with cochlear implants. Trends Amplif 2012; 16:67-82. [PMID: 22790392] [PMCID: PMC3560417] [DOI: 10.1177/1084713812451159]
Abstract
The present article reports on the perceptual weighting of prosodic cues in question-statement identification by adult cochlear implant (CI) listeners. Acoustic analyses of normal-hearing (NH) listeners' production of sentences spoken as questions or statements confirmed that in English the last bisyllabic word in a sentence carries the dominant cues (F0, duration, and intensity patterns) for the contrast. Furthermore, these analyses showed that the F0 contour is the primary cue for the question-statement contrast, with intensity and duration changes conveying important but less reliable information. On the basis of these acoustic findings, the authors examined adult CI listeners' performance in two question-statement identification tasks. In Task 1, 13 CI listeners' question-statement identification accuracy was measured using naturally uttered sentences matched for their syntactic structures. In Task 2, the same listeners' perceptual cue weighting in question-statement identification was assessed using resynthesized single-word stimuli, within which fundamental frequency (F0), intensity, and duration properties were systematically manipulated. Both tasks were also conducted with four NH listeners with full-spectrum and noise-band-vocoded stimuli. Perceptual cue weighting was assessed by comparing the estimated coefficients in logistic models fitted to the data. Of the 13 CI listeners, 7 achieved high performance levels in Task 1. The results of Task 2 indicated that multiple sources of acoustic cues for question-statement identification were utilized to different extents depending on the listening conditions (e.g., full spectrum vs. spectrally degraded) or the listeners' hearing and amplification status (e.g., CI vs. NH).
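The cue-weighting analysis described above fits logistic models to listeners' question/statement responses and compares the fitted coefficients across cues. A minimal sketch of that kind of analysis on simulated trial-level data (the variable names, step coding, and underlying weights are illustrative assumptions, not the study's data or exact model):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 400
# Hypothetical trial-level data: each acoustic cue coded as a signed manipulation step
df = pd.DataFrame({
    "f0_step":  rng.integers(-2, 3, n),   # F0 contour step
    "int_step": rng.integers(-2, 3, n),   # intensity step
    "dur_step": rng.integers(-2, 3, n),   # duration step
})
# Simulate a listener who relies mostly on F0 and only weakly on the other cues
z = 1.8 * df["f0_step"] + 0.4 * df["int_step"] + 0.2 * df["dur_step"]
df["question"] = (rng.random(n) < 1.0 / (1.0 + np.exp(-z))).astype(int)

# Fit a logistic model; a larger |coefficient| indicates heavier perceptual weight on that cue
fit = smf.logit("question ~ f0_step + int_step + dur_step", data=df).fit(disp=False)
print(fit.params)
```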
Affiliation(s)
- Shu-Chen Peng
- Division of Ophthalmic, Neurological, and Ear, Nose and Throat Devices, Office of Device Evaluation, U.S. Food and Drug Administration, 10903 New Hampshire Ave, Silver Spring, MD 20993, USA.
57
Lee FP, Hsu HT, Lin YS, Hung SC. Effects of the electrode location on tonal discrimination and speech perception of Mandarin-speaking patients with a cochlear implant. Laryngoscope 2012; 122:1366-78. [PMID: 22569966] [DOI: 10.1002/lary.23313]
Abstract
OBJECTIVES/HYPOTHESIS This study assessed the effects of varying the electrode location on tonal discrimination and speech perception in Mandarin Chinese-speaking patients. STUDY DESIGN A controlled study with six experimental conditions. METHODS Seven Mandarin-speaking listeners who had received a MED-EL cochlear implant (CI), ranging in age from 12.88 to 36.43 years (mean, 25.51 years), with an average of 5.28 years of device experience, participated in this study. To evaluate the effects of electrode location, six experimental conditions were designed, each with different electrodes switched off. Identification tests of Mandarin lexical tones and words were performed. RESULTS Among the experimental conditions with electrode lengths of 31, 23.8, and 16.6 mm, the CI subjects exhibited improved vowel and consonant identification in the 31 mm condition, reflecting the more apical location of the electrodes. Specifically, the improvement was observed in the identification scores for vowel backness and height, as well as for consonant place of articulation. Among the three settings with the same electrode length of 12.6 mm, stimulation of the mid-region of the cochlea produced better word, vowel, and consonant identification than stimulation of the basal and apical regions. However, no significant difference was observed in lexical tone identification among conditions with different electrode locations and stimulation regions. CONCLUSIONS Less mismatch in the frequency-to-place alignment may account for the improvement in word identification in conditions with electrode coverage extending to more apical locations and in conditions where the mid-region of the cochlea was stimulated.
Affiliation(s)
- Fei-Peng Lee
- Department of Otolaryngology, Wan Fang Hospital, Taipei, Taiwan
58
Macherey O, Carlyon RP. Place-pitch manipulations with cochlear implants. J Acoust Soc Am 2012; 131:2225-36. [PMID: 22423718] [PMCID: PMC3383798] [DOI: 10.1121/1.3677260]
Abstract
Pitch can be conveyed to cochlear implant listeners via both place of excitation and temporal cues. The transmission of place cues may be hampered by several factors, including limitations on the insertion depth and number of implanted electrodes, and the broad current spread produced by monopolar stimulation. The following series of experiments investigates several methods to partially overcome these limitations. Experiment 1 compares two recently published techniques that aim to activate more apical fibers than are activated by monopolar or bipolar stimulation of the most apical contacts. The first technique (phantom stimulation) manipulates the current spread by simultaneously stimulating two electrodes with opposite-polarity pulses of different amplitudes. The second technique manipulates the neural spread of excitation by using asymmetric pulses and exploiting the polarity-sensitive properties of auditory nerve fibers. The two techniques yielded similar results and were shown to produce lower place-pitch percepts than monopolar and bipolar stimulation with symmetric pulses. Furthermore, combining these two techniques may be advantageous in a clinical setting. Experiment 2 proposes a method to create place pitches intermediate to those produced by physical electrodes by using charge-balanced asymmetric pulses in bipolar mode with different degrees of asymmetry.
Affiliation(s)
- Olivier Macherey
- MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2 7EF, United Kingdom.
59
Winn MB, Chatterjee M, Idsardi WJ. The use of acoustic cues for phonetic identification: effects of spectral degradation and electric hearing. J Acoust Soc Am 2012; 131:1465-1479. [PMID: 22352517] [PMCID: PMC3292615] [DOI: 10.1121/1.3672705]
Abstract
Although some cochlear implant (CI) listeners can show good word recognition accuracy, it is not clear how they perceive and use the various acoustic cues that contribute to phonetic perceptions. In this study, the use of acoustic cues was assessed for normal-hearing (NH) listeners in optimal and spectrally degraded conditions, and also for CI listeners. Two experiments tested the tense/lax vowel contrast (varying in formant structure, vowel-inherent spectral change, and vowel duration) and the word-final fricative voicing contrast (varying in F1 transition, vowel duration, consonant duration, and consonant voicing). Identification results were modeled using mixed-effects logistic regression. These experiments suggested that under spectrally degraded conditions, NH listeners decrease their use of formant cues and increase their use of durational cues. Compared to NH listeners, CI listeners showed decreased use of spectral cues such as formant structure, formant change, and consonant voicing, and greater use of durational cues (especially for the fricative contrast). The results suggest that although NH and CI listeners may show similar accuracy on basic tests of word, phoneme, or feature recognition, they may be using different perceptual strategies in the process.
Affiliation(s)
- Matthew B Winn
- Department of Hearing and Speech Sciences, University of Maryland, College Park, 0100 Lefrak Hall, College Park, Maryland 20742, USA.
60
Välimaa TT, Sorri MJ, Laitakari J, Sivonen V, Muhli A. Vowel confusion patterns in adults during initial 4 years of implant use. Clin Linguist Phon 2011; 25:121-144. [PMID: 21070135] [DOI: 10.3109/02699206.2010.514692]
Abstract
This study investigated adult cochlear implant users' (n = 39) vowel recognition and confusions with an open-set syllable test during 4 years of implant use, in a prospective repeated-measures design. Subjects' responses were coded for phoneme errors and analyzed with a generalized mixed model. Improvement in overall vowel recognition was greatest during the first 6 months, with statistically significant change continuing until 4 years, especially for the mediocre performers. The best performers improved statistically significantly until 18 months. The poorest performers improved until 12 months and exhibited more vowel confusions. No differences were found in overall vowel recognition between Nucleus 24M/24R and Med-El C40+ device users (matched comparison), but certain vowels showed statistically significant differences. Confusions between adjacent vowels were evident, probably due to the implant users' inability to discriminate formant frequencies. Vowel confusions were also dominated by vowels whose average F1 and/or F2 frequencies were higher than those of the target vowel, indicating a basalward shift in the confusions.
Affiliation(s)
- Taina T Välimaa
- Faculty of Humanities, Logopedics, and Department of Otorhinolaryngology, Oulu University Hospital, University of Oulu, Finland.
61
Use of "phantom electrode" technique to extend the range of pitches available through a cochlear implant. Ear Hear 2011; 31:693-701. [PMID: 20467321 DOI: 10.1097/aud.0b013e3181e1d15e] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVE The range of pitch sensations available in cochlear implants (CIs) is conventionally thought to be limited by the location of the most apical and basal electrodes. However, partial bipolar stimulation, in which current is distributed to two intracochlear electrodes and one extracochlear electrode, can produce "phantom electrode" (PE) pitch percepts that extend beyond the pitch range available with physical electrodes. The goals of this study were (1) to determine the PE configuration that generated the lowest pitch relative to monopolar (MP) stimulation of the most apical electrode and (2) to determine the amount of pitch shift produced by different PE configurations. DESIGN Ten Advanced Bionics CI users (9 unilateral and 1 bilateral), implanted with the CII or HiRes 90k implant and the HiFocus 1, HiFocus 1j, or Helix electrode arrays, participated in this study. PEs were created by simultaneously stimulating the primary and compensating electrodes in opposite phase. To test different PE configurations, the proportion of current delivered to the compensating electrode (sigma) and the electrode separation between the primary and compensating electrode (D) were varied. To estimate the relative pitch of PEs, the lowest-pitched PEs with primary electrodes 4 and 8 were compared with subsets of MP electrodes (1, 2, 3, 4, 5 and 5, 6, 7, 8, 9, respectively). RESULTS In all subjects, it was possible to identify sigma and D values that produced a PE that was lower in pitch than the MP stimulation of the primary electrode. In some subjects, increasing sigma and/or D produced progressively lower pitch percepts, whereas in others, PE pitch changed nonmonotonically with sigma and/or D. The amount of PE pitch shift could be estimated for only 14 cases; in seven cases, the pitch shift was <1 MP electrode, and in the seven other cases, the pitch shift was between 1 and 2 MP electrodes. CONCLUSIONS PE stimulation can elicit pitch percepts lower than that of the most apical MP electrode; the PE pitch is lower by the equivalent of 0.5 to 2 MP electrodes.
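The partial bipolar ("phantom electrode") configuration described above splits the return current between a compensating intracochlear contact and the extracochlear ground, with sigma giving the proportion routed to the compensating contact in opposite phase. A minimal arithmetic sketch of that current split (the example current level is an illustrative assumption):

```python
def phantom_electrode_currents(i_primary_uA, sigma):
    """Return (compensating, extracochlear) currents for a phantom-electrode
    configuration: the compensating intracochlear contact carries a fraction
    sigma of the primary current in opposite phase, and the remaining
    (1 - sigma) returns through the extracochlear ground electrode."""
    i_compensating = -sigma * i_primary_uA
    i_ground = -(1.0 - sigma) * i_primary_uA
    return i_compensating, i_ground

# Example: 500 uA on the primary contact with sigma = 0.75
print(phantom_electrode_currents(500.0, 0.75))   # -> (-375.0, -125.0)
```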
63
Qi B, Liu B, Krenmayr A, Liu S, Gong S, Liu H, Zhang N, Han D. The contribution of apical stimulation to Mandarin speech perception in users of the MED-EL COMBI 40+ cochlear implant. Acta Otolaryngol 2011; 131:52-8. [PMID: 20863152] [DOI: 10.3109/00016489.2010.506652]
Abstract
CONCLUSION Not stimulating the apical cochlear region in tonal-language-speaking cochlear implantees significantly reduces discrimination of Mandarin vowels. The data presented here suggest that electrode arrays that allow complete cochlear coverage with stimulation pulses seem preferable to shorter arrays for use in cochlear implant (CI) indications. OBJECTIVE To assess the contribution of electrical stimulation beyond the first cochlear turn to tonal language speech perception. METHODS Twelve Mandarin-speaking users of the MED-EL COMBI 40+ cochlear implant with complete insertion of the standard COMBI 40+ electrode array participated in the study. Acute speech tests were performed in seven electrode configurations with stimulation either distributed over the whole length of the cochlea or restricted to the apical, middle or basal regions. The test battery comprised tone, consonant, and vowel identification in quiet as well as a sentence recognition task in quiet and noise. RESULTS While neither tone nor consonant identification depended crucially on the placement of the active electrodes, vowel identification and sentence recognition decreased significantly when the four apical electrodes were not stimulated.
Affiliation(s)
- Beier Qi
- Beijing Tong Ren Hospital, Capital Medical University, Beijing Institute of Otolaryngology, Ministry of Education, China
64
Interactions between unsupervised learning and the degree of spectral mismatch on short-term perceptual adaptation to spectrally shifted speech. Ear Hear 2010; 30:238-49. [PMID: 19194293] [DOI: 10.1097/aud.0b013e31819769ac]
Abstract
OBJECTIVES Cochlear implant listeners are able to at least partially adapt to the spectral mismatch associated with the implant device and speech processor via daily exposure and/or explicit training. The overall goal of this study was to investigate interactions between short-term unsupervised learning (i.e., passive adaptation) and the degree of spectral mismatch in normal-hearing listeners' adaptation to spectrally shifted vowels. DESIGN Normal-hearing subjects were tested while listening to acoustic cochlear implant simulations. Unsupervised learning was measured by testing vowel recognition repeatedly over a 5 day period; no feedback or explicit training was provided. In experiment 1, subjects listened to 8-channel, sine-wave vocoded speech. The spectral envelope was compressed to simulate a 16 mm cochlear implant electrode array. The analysis bands were fixed and the compressed spectral envelope was linearly shifted toward the base by 3.6, 6, or 8.3 mm to simulate different insertion depths of the electrode array, resulting in a slight, moderate, or severe spectral shift. In experiment 2, half the subjects were exclusively exposed to a severe shift with 8 or 16 channels (exclusive groups), and half the subjects were exposed to 8-channel severely shifted speech, 16-channel severely shifted speech, and 8-channel moderately shifted speech, alternately presented within each test session (mixed group). The region of stimulation in the cochlea was fixed (16 mm in extent and 15 mm from the apex) and the analysis bands were manipulated to create the spectral shift conditions. To determine whether increased spectral resolution would improve adaptation, subjects were exposed to 8- or 16-channel severely shifted speech. RESULTS In experiment 1, at the end of the adaptation period, there was no significant difference between 8-channel speech that was spectrally matched and that shifted by 3.6 mm. There was a significant, but less-complete, adaptation to the 6 mm shift and no adaptation to the 8.3 mm shift. In experiment 2, for the mixed exposure group, there was significant adaptation to severely shifted speech with 8 channels and even greater adaptation with 16 channels. For the exclusive exposure group, there was no significant adaptation to severely shifted speech with either 8 or 16 channels. CONCLUSIONS These findings suggest that listeners are able to passively adapt to spectral shifts up to 6 mm. For spectral shifts beyond 6 mm, some passive adaptation was observed with mixed exposure to a smaller spectral shift, even at the expense of some low frequency information. Mixed exposure to the smaller shift may have enhanced listeners' access to spectral envelope details that were not accessible when listening exclusively to severely shifted speech. The results suggest that the range of spectral mismatch that can support passive adaptation may be larger than previously reported. Some amount of passive adaptation may be possible with severely shifted speech by exposing listeners to a relatively small mismatch in conjunction with the severe mismatch.
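In these simulations, a basalward shift of s mm delivers each analysis band's envelope to a place s mm closer to the base, and the Greenwood map converts that place into a characteristic frequency. A small sketch of that conversion (the 35 mm cochlear length is a standard assumption; the 15 mm apical-edge place is taken from the abstract's description of the stimulated region):

```python
import numpy as np

def greenwood_cf(x_mm, length_mm=35.0):
    """Greenwood (1990) human frequency-position function; x_mm is measured from the apex."""
    return 165.4 * (10.0 ** (2.1 * x_mm / length_mm) - 0.88)

# Characteristic frequency of the place receiving the most apical band when a simulated
# array with its apical edge 15 mm from the apex is shifted toward the base
for shift_mm in (0.0, 3.6, 6.0, 8.3):
    print(f"basal shift {shift_mm:3.1f} mm -> apical-edge CF {greenwood_cf(15.0 + shift_mm):6.0f} Hz")
```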
65
Li T, Fu QJ. Effects of spectral shifting on speech perception in noise. Hear Res 2010; 270:81-8. [PMID: 20868733] [DOI: 10.1016/j.heares.2010.09.005]
Abstract
The present study used eight normal-hearing (NH) subjects, listening to acoustic cochlear implant (CI) simulations, to examine the effects of spectral shifting on speech recognition in noise. Speech recognition was measured using spectrally matched and shifted speech (vowels, consonants, and IEEE sentences), generated by an 8-channel, sine-wave vocoder. Measurements were made in quiet and in noise (speech-shaped static noise and speech babble at 5 dB signal-to-noise ratio). One spectral match condition and four spectral shift conditions were investigated: 2 mm, 3 mm, and 4 mm linear shifts, and a 3 mm shift with compression, in terms of cochlear distance. Results showed that speech recognition scores dropped because of noise and spectral shifting, and that the interactive effects of spectral shifting and background conditions depended on the degree/type of spectral shift, the background conditions, and the speech test materials. There was no significant interaction between spectral shifting and the two noise conditions for any of the speech test materials. However, significant interactions between linear spectral shifts and all background conditions were found in sentence recognition; significant interactions between spectral shift types and all background conditions were found in vowel recognition. Overall, the results suggest that tonotopic mismatch may affect the performance of CI users in complex listening environments.
Affiliation(s)
- Tianhao Li
- Division of Communication and Auditory Neuroscience, House Ear Institute, Los Angeles, CA 90057, USA.
66
Improving melody recognition in cochlear implant recipients through individualized frequency map fitting. Eur Arch Otorhinolaryngol 2010; 268:27-39. [DOI: 10.1007/s00405-010-1335-7]
67
Zhou N, Xu L, Lee CY. The effects of frequency-place shift on consonant confusion in cochlear implant simulations. J Acoust Soc Am 2010; 128:401-9. [PMID: 20649234] [PMCID: PMC2921437] [DOI: 10.1121/1.3436558]
Abstract
The effects of frequency-place shift on consonant recognition and confusion matrices were examined. Frequency-place shift was manipulated using a noise-excited vocoder with 4 to 16 channels. In the vocoder processing, the location of the most apical carrier band varied from the matched condition (i.e., 28 mm from the base of the cochlea) to a basal shift (i.e., 22 mm from the base) in 1 mm steps. Ten normal-hearing subjects participated in the 20-alternative forced-choice test, where the consonants were presented in a /Ca/ context. Shifts of 3 mm or more caused the consonant recognition scores to decrease significantly. The effects of spectral resolution disappeared when the amount of shift reached 3 mm or more. Information transmitted for voicing and place of articulation varied with spectral shift and spectral resolution, while information transmitted for manner was affected only by spectral shift but not spectral resolution. Spectral shift had specific effects on the confusion patterns of the consonants. The direction of errors reversed as spectral shift increased and the patterns of reversal were consistent across channel conditions. Overall, transmission of the consonant features can be accounted for by the acoustic features of the speech signal.
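For reference, a minimal sketch of the noise-excited vocoder processing used in this and several neighbouring studies: band-pass analysis, envelope extraction, and envelope modulation of band-limited noise carriers. The filter orders, envelope cutoff, and example band edges are illustrative assumptions rather than any study's exact parameters.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def noise_vocoder(x, fs, edges_hz, env_cutoff_hz=160.0):
    """Noise-excited vocoder sketch: split x into bands, extract each band's
    envelope, modulate band-limited noise with it, and sum the channels.
    edges_hz lists the n_channels + 1 band edges in Hz."""
    rng = np.random.default_rng(0)
    lowpass = butter(4, env_cutoff_hz, btype="lowpass", fs=fs, output="sos")
    out = np.zeros(len(x))
    for lo, hi in zip(edges_hz[:-1], edges_hz[1:]):
        band = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = sosfiltfilt(lowpass, np.maximum(sosfiltfilt(band, x), 0.0))  # rectify + smooth
        carrier = sosfiltfilt(band, rng.standard_normal(len(x)))           # band-limited noise
        out += np.clip(env, 0.0, None) * carrier
    return out / (np.max(np.abs(out)) + 1e-12)

# Example: an 8-channel vocoder with log-spaced bands between 0.3 and 8 kHz
# edges = np.geomspace(300.0, 8000.0, 9); y = noise_vocoder(x, fs, edges)
```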
Affiliation(s)
- Ning Zhou
- School of Hearing, Speech and Language Sciences, Ohio University, Athens, Ohio 45701, USA
68
Siciliano CM, Faulkner A, Rosen S, Mair K. Resistance to learning binaurally mismatched frequency-to-place maps: implications for bilateral stimulation with cochlear implants. J Acoust Soc Am 2010; 127:1645-60. [PMID: 20329863] [DOI: 10.1121/1.3293002]
Abstract
Simulations of monaural cochlear implants in normal-hearing listeners have shown that the deleterious effects of upward spectral shifting on speech perception can be overcome with training. This study simulates bilateral stimulation with a unilateral spectral shift to investigate whether listeners can adapt to upward-shifted speech information presented together with contralateral unshifted information. A six-channel, dichotic, interleaved sine-carrier vocoder simulated a binaurally mismatched frequency-to-place map. Odd channels were presented to one ear with an upward frequency shift equivalent to 6 mm on the basilar membrane, while even channels were presented to the contralateral ear unshifted. In Experiment 1, listeners were trained for 5.3 h with either the binaurally mismatched processor or with just the shifted monaural bands. In Experiment 2, the duration of training was 10 h, and the trained condition alternated between those of Experiment 1. While listeners showed learning in both experiments, intelligibility with the binaurally mismatched processor never exceeded intelligibility with just the three unshifted bands, suggesting that listeners did not benefit from combining the mismatched maps, even though there was clear scope to do so. Frequency-place map alignment may thus be of importance when optimizing bilateral devices of the type studied here.
Affiliation(s)
- Catherine M Siciliano
- Speech, Hearing and Phonetic Sciences, Division of Psychology and Language Sciences, UCL, Chandler House, 2 Wakefield Street, London WC1N 1PF, United Kingdom.
69
Goupell MJ, Majdak P, Laback B. Median-plane sound localization as a function of the number of spectral channels using a channel vocoder. J Acoust Soc Am 2010; 127:990-1001. [PMID: 20136221] [PMCID: PMC3061453] [DOI: 10.1121/1.3283014]
Abstract
Using a vocoder, median-plane sound localization performance was measured in eight normal-hearing listeners as a function of the number of spectral channels. The channels were contiguous and logarithmically spaced in the range from 0.3 to 16 kHz. Acute testing with vocoded stimuli showed significantly worse localization compared to noises and 100-pulse click trains, both of which were tested after feedback training. However, localization for the vocoded stimuli was better than chance. A second experiment was performed using two different 12-channel spacings for the vocoded stimuli, now including feedback training. One spacing was from experiment 1. The second spacing (called the speech-localization spacing) assigned more channels to the frequency range associated with speech. There was no significant difference in localization between the two spacings. However, even with training, localizing 12-channel vocoded stimuli remained worse than localizing virtual wideband noises by 4.8 degrees in local root-mean-square error and 5.2% in quadrant error rate. Speech understanding for the speech-localization spacing was not significantly different from that for a typical spacing used by cochlear-implant users. These experiments suggest that current cochlear implants have a sufficient number of spectral channels for some vertical-plane sound localization capabilities, albeit worse than normal-hearing listeners, without loss of speech understanding.
Affiliation(s)
- Matthew J Goupell
- Acoustics Research Institute, Austrian Academy of Sciences, Wohllebengasse 12-14, A-1040 Vienna, Austria.
70
Garadat SN, Litovsky RY, Yu G, Zeng FG. Effects of simulated spectral holes on speech intelligibility and spatial release from masking under binaural and monaural listening. J Acoust Soc Am 2010; 127:977-89. [PMID: 20136220] [PMCID: PMC2830263] [DOI: 10.1121/1.3273897]
Abstract
The possibility that "dead regions" or "spectral holes" can account for some differences in performance between bilateral cochlear implant (CI) users and normal-hearing listeners was explored. Using a 20-band noise-excited vocoder to simulate CI processing, this study examined effects of spectral holes on speech reception thresholds (SRTs) and spatial release from masking (SRM) in difficult listening conditions. Prior to processing, stimuli were convolved through head-related transfer-functions to provide listeners with free-field directional cues. Processed stimuli were presented over headphones under binaural or monaural (right ear) conditions. Using Greenwood's [(1990). J. Acoust. Soc. Am. 87, 2592-2605] frequency-position function and assuming a cochlear length of 35 mm, spectral holes were created for variable sizes (6 and 10 mm) and locations (base, middle, and apex). Results show that middle-frequency spectral holes were the most disruptive to SRTs, whereas high-frequency spectral holes were the most disruptive to SRM. Spectral holes generally reduced binaural advantages in difficult listening conditions. These results suggest the importance of measuring dead regions in CI users. It is possible that customized programming for bilateral CI processors based on knowledge about dead regions can enhance performance in adverse listening situations.
Affiliation(s)
- Soha N Garadat
- Waisman Center, University of Wisconsin, 1500 Highland Avenue, Madison, Wisconsin 53705, USA
71
Transfer of auditory perceptual learning with spectrally reduced speech to speech and nonspeech tasks: implications for cochlear implants. Ear Hear 2010; 30:662-74. [PMID: 19773659] [DOI: 10.1097/aud.0b013e3181b9c92d]
Abstract
OBJECTIVE The objective of this study was to assess whether training on speech processed with an eight-channel noise vocoder to simulate the output of a cochlear implant would produce transfer of auditory perceptual learning to the recognition of nonspeech environmental sounds, the identification of speaker gender, and the discrimination of talkers by voice. DESIGN Twenty-four normal-hearing subjects were trained to transcribe meaningful English sentences processed with a noise vocoder simulation of a cochlear implant. An additional 24 subjects served as an untrained control group and transcribed the same sentences in their unprocessed form. All subjects completed pre- and post-test sessions in which they transcribed vocoded sentences to provide an assessment of training efficacy. Transfer of perceptual learning was assessed using a series of closed-set, nonlinguistic tasks: subjects identified talker gender, discriminated the identity of pairs of talkers, and identified ecologically significant environmental sounds from a closed set of alternatives. RESULTS Although both groups of subjects showed significant pre- to post-test improvements, subjects who transcribed vocoded sentences during training performed significantly better at post-test than those in the control group. Both groups performed equally well on gender identification and talker discrimination. Subjects who received explicit training on the vocoded sentences, however, performed significantly better on environmental sound identification than the untrained subjects. Moreover, across both groups, pre-test speech performance and, to a higher degree, post-test speech performance, were significantly correlated with environmental sound identification. For both groups, environmental sounds that were characterized as having more salient temporal information were identified more often than environmental sounds that were characterized as having more salient spectral information. CONCLUSIONS Listeners trained to identify noise-vocoded sentences showed evidence of transfer of perceptual learning to the identification of environmental sounds. In addition, the correlation between environmental sound identification and sentence transcription indicates that subjects who were better able to use the degraded acoustic information to identify the environmental sounds were also better able to transcribe the linguistic content of novel sentences. Both trained and untrained groups performed equally well (approximately 75% correct) on the gender-identification task, indicating that training did not have an effect on the ability to identify the gender of talkers. Although better than chance, performance on the talker discrimination task was poor overall (approximately 55%), suggesting that either explicit training is required to discriminate talkers' voices reliably or that additional information (perhaps spectral in nature) not present in the vocoded speech is required to excel in such tasks. Taken together, the results suggest that although transfer of auditory perceptual learning with spectrally degraded speech does occur, explicit task-specific training may be necessary for tasks that cannot rely on temporal information alone.
72
Fu QJ, Galvin J, Wang X, Nogaki G. Effects of auditory training on adult cochlear implant patients: a preliminary report. Cochlear Implants Int 2009; 5 Suppl 1:84-90. [PMID: 18792249] [DOI: 10.1179/cim.2004.5.supplement-1.84]
Abstract
The process of learning new electrically stimulated speech patterns can be difficult for many cochlear implant users, especially congenitally deafened patients. Some implant users receive little benefit from the device, even after long-term experience. While many factors may influence individual patient outcomes, the paucity of auditory rehabilitation resources, especially for adult users, may contribute to some implant patients' poorer performance. The present study examined whether moderate auditory training, using speech stimuli, can improve the speech-recognition performance of adult cochlear implant patients. Ten cochlear implant patients with limited speech-recognition capabilities used a recently developed computer-based auditory rehabilitation tool to train at home for a period of one month or longer. Before training began, baseline speech-recognition performance was measured for each patient; baseline performance was measured for at least two weeks, until performance asymptoted. After baseline measures were complete, subjects were instructed to train themselves at home using novel monosyllabic words one hour per day, five days per week. Subjects then returned to the lab every two weeks for retesting with the baseline speech materials. Preliminary results showed that there was significant improvement in all patients' speech perception performance after moderate training. While most patients did improve, the amount and time course of improvement were highly variable. Moderate training using a computer-based auditory rehabilitation tool can be an effective approach to improving cochlear implant patients' speech recognition, especially for poorer-performing implant users.
Affiliation(s)
- Qian-Jie Fu
- Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, CA 90057, USA.
73
Abstract
Memories of conversations are composed of what was said (speech content) and information about the speaker's voice (speaker identity). In the current study, we examined whether patients with schizophrenia would show difficulties integrating speech content and speaker identity in memory, as measured in a gender-identity (male/female) recognition task. Forty-one patients and a comparison group of 20 healthy controls took part in the study. In contrast to controls, patients demonstrated greater impairments in memory for female, but not male, voices. These results are consistent with studies of speech perception showing that female voices have more complex "vocal" characteristics and require greater integration compared with male voices, and with the context memory hypothesis of schizophrenia, which suggests that memory binding impairments may result in degraded or incomplete representations of memory traces as the task requirements become increasingly complex.
74
Sagi E, Fu QJ, Galvin JJ, Svirsky MA. A model of incomplete adaptation to a severely shifted frequency-to-electrode mapping by cochlear implant users. J Assoc Res Otolaryngol 2009; 11:69-78. [PMID: 19774412] [DOI: 10.1007/s10162-009-0187-6]
Abstract
In the present study, a computational model of phoneme identification was applied to data from a previous study, wherein cochlear implant (CI) users' adaptation to a severely shifted frequency allocation map was assessed regularly over 3 months of continual use. This map provided more input filters below 1 kHz, but at the expense of introducing a downwards frequency shift of up to one octave in relation to the CI subjects' clinical maps. At the end of the 3-month study period, it was unclear whether subjects' asymptotic speech recognition performance represented a complete or partial adaptation. To clarify the matter, the computational model was applied to the CI subjects' vowel identification data in order to estimate the degree of adaptation, and to predict performance levels with complete adaptation to the frequency shift. Two model parameters were used to quantify this adaptation; one representing the listener's ability to shift their internal representation of how vowels should sound, and the other representing the listener's uncertainty in consistently recalling these representations. Two of the three CI users could shift their internal representations towards the new stimulation pattern within 1 week, whereas one could not do so completely even after 3 months. Subjects' uncertainty for recalling these representations increased substantially with the frequency-shifted map. Although this uncertainty decreased after 3 months, it remained much larger than subjects' uncertainty with their clinically assigned maps. This result suggests that subjects could not completely remap their phoneme labels, stored in long-term memory, towards the frequency-shifted vowels. The model also predicted that even with complete adaptation, the frequency-shifted map would not have resulted in improved speech understanding. Hence, the model presented here can be used to assess adaptation, and the anticipated gains in speech perception expected from changing a given CI device parameter.
Affiliation(s)
- Elad Sagi
- Department of Otolaryngology, New York University School of Medicine, 550 First Avenue, NBV-5E5, New York, NY 10016, USA.
75
Ming VL, Holt LL. Efficient coding in human auditory perception. J Acoust Soc Am 2009; 126:1312-20. [PMID: 19739745] [PMCID: PMC2809690] [DOI: 10.1121/1.3158939]
Abstract
Natural sounds possess characteristic statistical regularities. Recent research suggests that mammalian auditory processing maximizes information about these regularities in its internal representation while minimizing encoding cost [Smith, E. C. and Lewicki, M. S. (2006). Nature (London) 439, 978-982]. Evidence for this "efficient coding hypothesis" comes largely from neurophysiology and theoretical modeling [Olshausen, B. A., and Field, D. (2004). Curr. Opin. Neurobiol. 14, 481-487; DeWeese, M., et al. (2003). J. Neurosci. 23, 7940-7949; Klein, D. J., et al. (2003). EURASIP J. Appl. Signal Process. 7, 659-667]. The present research provides behavioral evidence for efficient coding in human auditory perception using six-channel noise-vocoded speech, which drastically limits spectral information and degrades recognition accuracy. Two experiments compared recognition accuracy of vocoder speech created using theoretically-motivated, efficient coding filterbanks derived from the statistical regularities of speech against recognition using standard cochleotopic (logarithmic) or linear filterbanks. Recognition of the speech created using efficient encoding filterbanks was significantly more accurate than either of the other classes. These findings suggest potential applications to cochlear implant design.
Affiliation(s)
- Vivienne L Ming
- Redwood Center for Theoretical Neuroscience, University of California at Berkeley, Berkeley, CA 94720, USA.
76
Peng SC, Lu N, Chatterjee M. Effects of cooperating and conflicting cues on speech intonation recognition by cochlear implant users and normal hearing listeners. Audiol Neurootol 2009; 14:327-37. [PMID: 19372651] [PMCID: PMC2715009] [DOI: 10.1159/000212112]
Abstract
Cochlear implant (CI) recipients have only limited access to fundamental frequency (F0) information, and thus exhibit deficits in speech intonation recognition. For speech intonation, F0 serves as the primary cue, and other potential acoustic cues (e.g. intensity properties) may also contribute. This study examined the effects of cooperating or conflicting acoustic cues on speech intonation recognition by adult CI and normal hearing (NH) listeners with full-spectrum and spectrally degraded speech stimuli. Identification of speech intonation that signifies question and statement contrasts was measured in 13 CI recipients and 4 NH listeners, using resynthesized bi-syllabic words, where F0 and intensity properties were systematically manipulated. The stimulus set comprised tokens whose acoustic cues (i.e. F0 contour and intensity patterns) were either cooperating or conflicting. Subjects identified whether each stimulus was a 'statement' or a 'question' in a single-interval, 2-alternative forced-choice (2AFC) paradigm. Logistic models were fitted to the data, and estimated coefficients were compared under cooperating and conflicting conditions, between the subject groups (CI vs. NH), and under full-spectrum and spectrally degraded conditions for NH listeners. The results indicated that CI listeners' intonation recognition was enhanced by cooperating F0 contour and intensity cues, but was adversely affected when these cues were conflicting. On the other hand, with full-spectrum stimuli, NH listeners' intonation recognition was not affected by cues being cooperating or conflicting. The effects of cues being cooperating or conflicting were comparable between the CI group and NH listeners with spectrally degraded stimuli. These findings suggest the importance of taking multiple sources of acoustic cues for speech recognition into consideration in aural rehabilitation for CI recipients.
Affiliation(s)
- Shu-Chen Peng
- Center for Device and Radiological Health, US Food and Drug Administration, Rockville, MD, USA.
77
Schvartz KC, Chatterjee M, Gordon-Salant S. Recognition of spectrally degraded phonemes by younger, middle-aged, and older normal-hearing listeners. J Acoust Soc Am 2008; 124:3972-88. [PMID: 19206821] [PMCID: PMC2662854] [DOI: 10.1121/1.2997434]
Abstract
The effects of spectral degradation on vowel and consonant recognition abilities were measured in young, middle-aged, and older normal-hearing (NH) listeners. Noise-band vocoding techniques were used to manipulate the number of spectral channels and the frequency-to-place alignment, thereby simulating cochlear implant (CI) processing. A brief cognitive test battery was also administered. The performance of younger NH listeners exceeded that of the middle-aged and older listeners when stimuli were severely distorted (spectrally shifted); the older listeners performed only slightly worse than the middle-aged listeners. Significant intragroup variability was present in the middle-aged and older groups. A hierarchical multiple-regression analysis including data from all three age groups suggested that age was the primary factor related to shifted vowel recognition performance, but verbal memory abilities also contributed significantly to performance. A second regression analysis (within the middle-aged and older groups alone) revealed that verbal memory and speed-of-processing abilities were better predictors of performance than age alone. The overall results from the current investigation suggested that both chronological age and cognitive capacities contributed to the ability to recognize spectrally degraded phonemes. Such findings have important implications for the counseling and rehabilitation of adult CI recipients.
Affiliation(s)
- Kara C Schvartz
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA.
78
Spahr AJ, Litvak LM, Dorman MF, Bohanan AR, Mishra LN. Simulating the effects of spread of electric excitation on musical tuning and melody identification with a cochlear implant. J Speech Lang Hear Res 2008; 51:1599-606. [PMID: 18664681] [PMCID: PMC3683310] [DOI: 10.1044/1092-4388(2008/07-0254)]
Abstract
PURPOSE To determine why, in a pilot study, only 1 of 11 cochlear implant listeners was able to reliably identify a frequency-to-electrode map where the intervals of a familiar melody were played on the correct musical scale. The authors sought to validate their method and to assess the effect of pitch strength on musical scale recognition in normal-hearing listeners. METHOD Musical notes were generated as either sine waves or spectrally shaped noise bands, with a center frequency equal to that of a desired note and symmetrical (log-scale) reduction in amplitude away from the center frequency. The rate of amplitude reduction was manipulated to vary pitch strength of the notes and to simulate different degrees of current spread. The effect of the simulated degree of current spread was assessed on tasks of musical tuning/scaling, melody recognition, and frequency discrimination. RESULTS Normal-hearing listeners could accurately and reliably identify the appropriate musical scale when stimuli were sine waves or steeply sloping noise bands. Simulating greater current spread degraded performance on all tasks. CONCLUSIONS Cochlear implant listeners with an auditory memory of a familiar melody could likely identify an appropriate frequency-to-electrode map but only in cases where the pitch strength of the electrically produced notes is very high.
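The stimulus construction lends itself to a short sketch: a noise "note" whose spectrum falls off symmetrically on a log-frequency axis around the note's center frequency, with the roll-off slope standing in for the degree of simulated current spread. The slope values and other parameters below are illustrative assumptions, not those used in the study.

```python
# Illustrative sketch (assumptions, not the authors' stimulus code): a "note" built
# from noise whose spectrum rolls off symmetrically, in dB per octave, around the
# note's center frequency; steeper slopes give stronger pitch (less simulated spread).
import numpy as np

def shaped_noise_note(f_center, slope_db_per_octave, dur=0.5, fs=22050):
    n = int(dur * fs)
    spectrum = np.fft.rfft(np.random.randn(n))
    freqs = np.fft.rfftfreq(n, 1 / fs)
    octaves = np.abs(np.log2(np.maximum(freqs, 1.0) / f_center))
    gain_db = -slope_db_per_octave * octaves          # symmetric roll-off
    spectrum *= 10 ** (gain_db / 20)
    note = np.fft.irfft(spectrum, n)
    return note / np.max(np.abs(note))

steep = shaped_noise_note(440.0, 96.0)    # strong pitch, little simulated spread
shallow = shaped_noise_note(440.0, 12.0)  # weak pitch, broad simulated spread
```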
Affiliation(s)
- Anthony J Spahr
- Department of Speech and Hearing Science, Arizona State University, Lattie F. Coor Hall, Room 3462, Tempe, AZ 85287-0102, USA.
|
79
|
Assmann PF, Nearey TM. Identification of frequency-shifted vowels. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 124:3203-3212. [PMID: 19045804 DOI: 10.1121/1.2980456] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
Within certain limits, speech intelligibility is preserved with upward or downward scaling of the spectral envelope. To study these limits and assess their interaction with fundamental frequency (F0), vowels in /hVd/ syllables were processed using the STRAIGHT vocoder and presented to listeners for identification. Identification accuracy showed a gradual decline when the spectral envelope was scaled up or down in vowels spoken by men, women, and children. Upward spectral envelope shifts led to poorer identification of children's vowels compared to adults, while downward shifts had a greater impact on men's vowels compared to women and children. Coordinated shifts (F0 and spectral envelope shifted in the same direction) generally produced higher accuracy than conditions with F0 and spectral envelope shifted in opposite directions. Vowel identification was poorest in conditions with very high F0, consistent with suggestions from the literature that sparse sampling of the spectral envelope may be a factor in vowel identification. However, the gradual decline in accuracy as a function of both upward and downward spectral envelope shifts and the interaction between spectral envelope shifts and F0 suggests the additional operation of perceptual mechanisms sensitive to the statistical covariation of F0 and formant frequencies in natural speech.
Affiliation(s)
- Peter F Assmann
- School of Behavioral and Brain Sciences, University of Texas at Dallas, Richardson, Texas 75083-0688, USA.
|
80
|
Zhou N, Xu L. Lexical tone recognition with spectrally mismatched envelopes. Hear Res 2008; 246:36-43. [PMID: 18848614 DOI: 10.1016/j.heares.2008.09.006] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/29/2008] [Revised: 09/16/2008] [Accepted: 09/17/2008] [Indexed: 12/21/2022]
Abstract
It has been shown that frequency-place mismatch has detrimental effects on English speech recognition. The present study investigated the effects of a mismatched spectral distribution of envelopes on Mandarin Chinese tone recognition using a noise-excited vocoder. In Experiment 1, speech samples were processed to simulate a cochlear implant at various insertion depths: the carrier bands were shifted basally relative to the analysis bands by 1-7 mm along the cochlea. Nine normal-hearing Mandarin Chinese listeners participated in this experiment. Basal shift of the carriers only slightly affected tone recognition. This resistance of tone recognition to spectral shift can be attributed to overall amplitude contour cues, which are independent of the spectral manipulations. Experiment 2 examined the effects of frequency compression, in which analysis bands widened by 2, 6, and 10 mm were compressively mapped onto narrower carrier bands. Five of the 9 subjects participated in Experiment 2. The additional low-frequency information captured by the widened analysis bands appears to compensate for the distortion introduced by frequency compression. Thus, spectral shift might not pose a severe problem for tone recognition, and allocating a wider frequency range that includes more low-frequency information might be beneficial for tone recognition.
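The basal-shift manipulation can be made concrete with the Greenwood (1990) frequency-position function. The sketch below maps a set of illustrative analysis-band edges to basally shifted carrier-band edges; the band layout, shift values, and 35-mm cochlear length are assumptions for illustration, not the study's exact parameters.

```python
# A rough sketch, not the authors' code: mapping analysis-band edges to basally
# shifted carrier-band edges via the Greenwood frequency-position function for a
# 35-mm human cochlea (x measured in mm from the apex).
import numpy as np

A, a, k = 165.4, 0.06, 0.88  # Greenwood constants for the human cochlea

def freq_to_place_mm(f):
    """Distance from the apex (mm) for frequency f (Hz)."""
    return np.log10(f / A + k) / a

def place_mm_to_freq(x):
    """Frequency (Hz) at distance x mm from the apex."""
    return A * (10 ** (a * x) - k)

analysis_edges = np.logspace(np.log10(200), np.log10(7000), 9)  # 8 analysis bands
for shift_mm in (0, 1, 3, 5, 7):                                # basal shift of carriers
    carrier_edges = place_mm_to_freq(freq_to_place_mm(analysis_edges) + shift_mm)
    print(f"{shift_mm} mm shift: carriers {carrier_edges[0]:.0f}-{carrier_edges[-1]:.0f} Hz")
```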
Affiliation(s)
- Ning Zhou
- School of Hearing, Speech and Language Sciences, Ohio University, Athens, OH 45701, USA
|
81
|
Bilateral speech comprehension reflects differential sensitivity to spectral and temporal features. J Neurosci 2008; 28:8116-23. [PMID: 18685036 DOI: 10.1523/jneurosci.1290-08.2008] [Citation(s) in RCA: 147] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Speech comprehension has been shown to be a strikingly bilateral process, but the differential contributions of the subfields of left and right auditory cortices have remained elusive. The hypothesis that left auditory areas engage predominantly in decoding fast temporal perturbations of a signal whereas the right areas are relatively more driven by changes of the frequency spectrum has not been directly tested in speech or music. This brain-imaging study independently manipulated the speech signal itself along the spectral and the temporal domain using noise-band vocoding. In a parametric design with five temporal and five spectral degradation levels in word comprehension, a functional distinction of the left and right auditory association cortices emerged: increases in the temporal detail of the signal were most effective in driving brain activation of the left anterolateral superior temporal sulcus (STS), whereas the right homolog areas exhibited stronger sensitivity to the variations in spectral detail. In accordance with behavioral measures of speech comprehension acquired in parallel, change of spectral detail exhibited a stronger coupling with the STS BOLD signal. The relative pattern of lateralization (quantified using lateralization quotients) proved reliable in a jack-knifed iterative reanalysis of the group functional magnetic resonance imaging model. This study supplies direct evidence to the often implied functional distinction of the two cerebral hemispheres in speech processing. Applying direct manipulations to the speech signal rather than to low-level surrogates, the results lend plausibility to the notion of complementary roles for the left and right superior temporal sulci in comprehending the speech signal.
|
82
|
Pfingst BE, Burkholder-Juhasz RA, Zwolan TA, Xu L. Psychophysical assessment of stimulation sites in auditory prosthesis electrode arrays. Hear Res 2008; 242:172-83. [PMID: 18178350 PMCID: PMC2593127 DOI: 10.1016/j.heares.2007.11.007] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/12/2007] [Revised: 11/20/2007] [Accepted: 11/20/2007] [Indexed: 12/12/2022]
Abstract
Auditory prostheses use implanted electrode arrays that permit stimulation at many sites along the tonotopic axis of auditory neurons. Psychophysical studies demonstrate that measures of implant function, such as detection and discrimination thresholds, vary considerably across these sites, that the across-site patterns of these measures differ across subjects, and that the likely mechanisms underlying this variability differ across measures. Psychophysical and speech recognition studies suggest that not all stimulation sites contribute equally to perception with the prosthesis and that some sites might have negative effects on perception. Studies that reduce the number of active stimulation sites indicate that most cochlear implant users can effectively utilize a maximum of only about seven sites in their processors. These findings support a strategy for improving implant performance by selecting only the best stimulation sites for the processor map. Another approach is to revise stimulation parameters for ineffective sites in an effort to improve acuity at those sites. In this paper, we discuss data supporting these approaches and some potential pitfalls.
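As a toy illustration of the site-selection strategy discussed above, the sketch below keeps the best-scoring subset of stimulation sites for a hypothetical processor map; the per-site scores, their scale, and the cutoff of seven sites are invented placeholders.

```python
# Illustrative only (scores are invented): keep the best-performing subset of
# stimulation sites for the processor map, as in the site-selection strategy above.
site_scores = {1: 0.42, 2: 0.66, 3: 0.31, 4: 0.74, 5: 0.58, 6: 0.49,
               7: 0.81, 8: 0.27, 9: 0.63, 10: 0.55, 11: 0.70, 12: 0.38}
n_keep = 7  # roughly the number of effectively used sites suggested above
selected = sorted(site_scores, key=site_scores.get, reverse=True)[:n_keep]
print("sites kept in the map:", sorted(selected))
```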
Affiliation(s)
- Bryan E Pfingst
- Kresge Hearing Research Institute, Department of Otolaryngology, University of Michigan Health System, Ann Arbor, MI 48109-5506, USA.
|
83
|
Liu C, Galvin J, Fu QJ, Narayanan SS. Effect of spectral normalization on different talker speech recognition by cochlear implant users. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 123:2836-2847. [PMID: 18529199 PMCID: PMC2676177 DOI: 10.1121/1.2897047] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/25/2007] [Revised: 02/12/2008] [Accepted: 02/23/2008] [Indexed: 05/26/2023]
Abstract
In cochlear implants (CIs), different talkers often produce different levels of speech understanding because of the spectrally distorted speech patterns provided by the implant device. A spectral normalization approach was used to transform the spectral characteristics of one talker to those of another talker. In Experiment 1, speech recognition with two talkers was measured in CI users with and without spectral normalization. Results showed that the spectral normalization algorithm had a small but significant effect on performance. In Experiment 2, the effects of spectral normalization were measured in CI users and normal-hearing (NH) subjects; a pitch-stretching technique was used to simulate six talkers with different fundamental frequencies and vocal tract configurations. NH baseline performance was nearly perfect with these pitch-shift transformations. For CI subjects, although there was considerable intersubject variability in performance across the different pitch-shift transformations, spectral normalization significantly improved the intelligibility of these simulated talkers. The results from Experiments 1 and 2 demonstrate that spectral normalization toward more-intelligible talkers significantly improved CI users' understanding of speech from less-intelligible talkers. The results suggest that spectral normalization using optimal reference patterns for individual CI patients may compensate for some of the acoustic variability across talkers.
Affiliation(s)
- Chuping Liu
- Department of Electrical Engineering, University of Southern California, Los Angeles, California 90089, USA.
|
84
|
Goupell MJ, Laback B, Majdak P, Baumgartner WD. Effects of upper-frequency boundary and spectral warping on speech intelligibility in electrical stimulation. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 123:2295-309. [PMID: 18397034 PMCID: PMC3061454 DOI: 10.1121/1.2831738] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/25/2023]
Abstract
Speech understanding was tested for seven listeners using 12-electrode Med-El cochlear implants (CIs) and six normal-hearing listeners using a CI simulation. Eighteen different types of processing were evaluated, which varied the frequency-to-tonotopic place mapping and the upper boundary of the frequency and stimulation range. Spectrally unwarped and warped conditions were included. Unlike previous studies on this topic, the lower boundary of the frequency and stimulation range was fixed while the upper boundary was varied. For the unwarped conditions, only eight to ten channels were needed in both quiet and noise to achieve no significant degradation in speech understanding compared to the normal 12-electrode speech processing. The unwarped conditions were often the best conditions for understanding speech; however, small changes in frequency-to-place mapping (<0.77 octaves for the most basal electrode) yielded no significant degradation in performance from the nearest unwarped condition. A second experiment measured the effect of feedback training for both the unwarped and warped conditions. Improvements were found for the unwarped and frequency-expanded conditions, but not for the compressed condition. These results have implications for new CI processing strategies, such as the inclusion of spectral localization cues.
Affiliation(s)
- Matthew J Goupell
- Acoustics Research Institute, Austrian Academy of Sciences, Wohllebengasse 12-14, A-1040 Vienna, Austria.
|
85
|
Reiss LAJ, Gantz BJ, Turner CW. Cochlear implant speech processor frequency allocations may influence pitch perception. Otol Neurotol 2008; 29:160-7. [PMID: 18025998 PMCID: PMC4243703 DOI: 10.1097/mao.0b013e31815aedf4] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
OBJECTIVE To investigate the effects of assigning cochlear implant speech processor frequencies normally associated with more apical cochlear locations to the shallow insertion depths of the Iowa/Nucleus Hybrid electrode. STUDY DESIGN Subjects using the Hybrid implant for more than 1 year were tested on speech recognition with Consonant-Nucleus-Consonant words and consonant stimuli. Pitch sensations elicited by individual electrodes were also measured, using electric stimulation through the implant and acoustic comparison tones in the contralateral ear. SETTING Tertiary care center. RESULTS Most subjects showed large improvements in speech recognition within 12 months after implantation. Furthermore, after 24 or more months, some individuals achieved high levels of consonant discrimination with electric-only processing, comparable to long-electrode patients with deeper electrode insertions. In these subjects, the pitch elicited by individual electrodes was closer to the frequencies assigned to those electrodes by the processor map than to the place frequencies predicted from their cochlear locations. CONCLUSION These results suggest that, over time, pitch sensations may be determined more by the implant map than by cochlear location; in other words, the brain may adapt to spectral mismatches by remapping pitch. Furthermore, patients can perform well with shifted frequency allocations for speech recognition. The successful application of shifted frequency allocations also supports the idea of shallower insertions and greater preservation of residual hearing for all cochlear implants, regardless of the patient's frequency range of usable residual hearing.
Affiliation(s)
- Lina A J Reiss
- Department of Speech Pathology and Audiology, The University of Iowa, Iowa City, Iowa 52242, U.S.A.
|
86
|
Abstract
Learning electrically stimulated speech patterns can be a new and difficult experience for cochlear implant (CI) recipients. Recent studies have shown that most implant recipients at least partially adapt to these new patterns via passive, daily-listening experiences. Gradually introducing a speech processor parameter (e.g., the degree of spectral mismatch) may provide for more complete and less stressful adaptation. Although the implant device restores hearing sensation and the continued use of the implant provides some degree of adaptation, active auditory rehabilitation may be necessary to maximize the benefit of implantation for CI recipients. Currently, there are scant resources for auditory rehabilitation for adult, postlingually deafened CI recipients. We recently developed a computer-assisted speech-training program to provide the means to conduct auditory rehabilitation at home. The training software targets important acoustic contrasts among speech stimuli, provides auditory and visual feedback, and incorporates progressive training techniques, thereby maintaining recipients' interest during the auditory training exercises. Our recent studies demonstrate the effectiveness of targeted auditory training in improving CI recipients' speech and music perception. Provided with an inexpensive and effective auditory training program, CI recipients may find the motivation and momentum to get the most from the implant device.
Affiliation(s)
- Qian-Jie Fu
- Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, California 90057, USA.
|
87
|
Abstract
OBJECTIVE To explore combined acute effects of frequency shift and compression-expansion on speech recognition, using noiseband vocoder processing. DESIGN Recognition of vowels and consonants, processed with a noiseband vocoder, was measured with five normal-hearing subjects, between the ages of 27 and 35 yr. The speech signal was filtered into 8 or 16 analysis bands and the envelopes were extracted from each band. The carrier noise bands were modulated by the envelopes and resynthesized to produce the processed speech. In the baseline matched condition, the frequency ranges of the corresponding analysis and carrier bands were the same. In the shift only condition, the frequency ranges of the carrier bands were shifted up or down relative to the analysis bands. In the compression and expansion only conditions, the analysis band range was made larger or smaller, respectively, than the carrier band range. By applying the shift to carrier bands and compression or expansion to analysis bands simultaneously, the combined effects of the two spectral distortions on speech recognition were explored. RESULTS When the spectral distortions of compression-expansion or shift were applied separately, the performance was reduced from the baseline matched condition. However, when the two spectral degradations were applied simultaneously, a compensatory effect was observed; the reduction in performance was smaller for some combinations compared to the reduction observed for each distortion individually. CONCLUSIONS The results of the present study are consistent with previous vocoder studies with normal-hearing subjects that showed a negative effect of spectral mismatch between analysis and carrier bands on speech recognition. The present results further show that matching the frequency ranges of 1 to 2 kHz, which contain important speech information, can be more beneficial for speech recognition than matching the overall frequency ranges, in certain conditions.
Affiliation(s)
- Deniz Başkent
- Department of Biomedical Engineering, University of Southern California, Los Angeles, USA.
|
88
|
Reiss LAJ, Turner CW, Erenberg SR, Gantz BJ. Changes in pitch with a cochlear implant over time. J Assoc Res Otolaryngol 2007; 8:241-57. [PMID: 17347777 PMCID: PMC2538353 DOI: 10.1007/s10162-007-0077-8] [Citation(s) in RCA: 117] [Impact Index Per Article: 6.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2006] [Accepted: 01/26/2007] [Indexed: 10/23/2022] Open
Abstract
In the normal auditory system, the perceived pitch of a tone is closely linked to the cochlear place of vibration. It has generally been assumed that high-rate electrical stimulation by a cochlear implant electrode also evokes a pitch sensation corresponding to the electrode's cochlear place ("place" code) and stimulation rate ("temporal" code). However, other factors may affect electric pitch sensation, such as a substantial loss of nearby nerve fibers or even higher-level perceptual changes due to experience. The goals of this study were to measure electric pitch sensations in hybrid (short-electrode) cochlear implant patients and to examine which factors might contribute to the perceived pitch. To look at effects of experience, electric pitch sensations were compared with acoustic tone references presented to the non-implanted ear at various stages of implant use, ranging from hookup to 5 years. Here, we show that electric pitch perception often shifts in frequency, sometimes by as much as two octaves, during the first few years of implant use. Additional pitch measurements in more recently implanted patients at shorter time intervals up to 1 year of implant use suggest two likely contributions to these observed pitch shifts: intersession variability (up to one octave) and slow, systematic changes over time. We also found that the early pitch sensations for a constant electrode location can vary greatly across subjects and that these variations are strongly correlated with speech reception performance. Specifically, patients with an early low-pitch sensation tend to perform poorly with the implant compared to those with an early high-pitch sensation, which may be linked to less nerve survival in the basal end of the cochlea in the low-pitch patients. In contrast, late pitch sensations show no correlation with speech perception. These results together suggest that early pitch sensations may more closely reflect peripheral innervation patterns, while later pitch sensations may reflect higher-level, experience-dependent changes. These pitch shifts over time not only raise questions for strict place-based theories of pitch perception, but also imply that experience may have a greater influence on cochlear implant perception than previously thought.
Affiliation(s)
- Lina A J Reiss
- Department of Speech Pathology and Audiology, Wendell Johnson Speech and Hearing Center, University of Iowa, Iowa City, IA 52242, USA.
|
89
|
Başkent D, Edwards B. Simulating listener errors in using genetic algorithms for perceptual optimization. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2007; 121:EL238-43. [PMID: 17552575 DOI: 10.1121/1.2731017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]
Abstract
The genetic algorithm (GA) has previously been suggested for fitting hearing aid or cochlear implant features on the basis of listeners' subjective judgments. In the present study, two human factors that might affect the outcome of the GA when used for perceptual optimization were explored with simulations: listeners with varying sensitivity in discriminating sentences of different intelligibility, and listeners with varying error rates in entering their judgments to the GA. A comparison of the simulation results with the results from human subjects reported by Başkent et al. [Ear Hear. 28(3), 277-289 (2007)] showed that these factors could reduce the performance of the GA considerably.
Affiliation(s)
- Deniz Başkent
- Starkey Hearing Research Center, 2150 Shattuck Ave., Ste. 408, Berkeley, California 94704, USA
|
90
|
Başkent D, Eiler CL, Edwards B. Using genetic algorithms with subjective input from human subjects: implications for fitting hearing aids and cochlear implants. Ear Hear 2007; 28:370-80. [PMID: 17485986 DOI: 10.1097/aud.0b013e318047935e] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
OBJECTIVE To present a comprehensive analysis of the feasibility of genetic algorithms (GA) for finding the best fit of hearing aids or cochlear implants for individual users in clinical or research settings, where the algorithm is solely driven by subjective human input. DESIGN Due to varying pathology, the best settings of an auditory device differ for each user. It is also likely that listening preferences vary at the same time. The settings of a device customized for a particular user can only be evaluated by the user. When optimization algorithms are used for fitting purposes, this situation poses a difficulty for a systematic and quantitative evaluation of the suitability of the fitting parameters produced by the algorithm. In the present study, an artificial listening environment was generated by distorting speech using a noiseband vocoder. The settings produced by the GA for this listening problem could objectively be evaluated by measuring speech recognition and comparing the performance to the best vocoder condition where speech was least distorted. Nine normal-hearing subjects participated in the study. The parameters to be optimized were the number of vocoder channels, the shift between the input frequency range and the synthesis frequency range, and the compression-expansion of the input frequency range over the synthesis frequency range. The subjects listened to pairs of sentences processed with the vocoder, and entered a preference for the sentence with better intelligibility. The GA modified the solutions iteratively according to the subject preferences. The program converged when the user ranked the same set of parameters as the best in three consecutive steps. The results produced by the GA were analyzed for quality by measuring speech intelligibility, for test-retest reliability by running the GA three times with each subject, and for convergence properties. RESULTS Speech recognition scores averaged across subjects were similar for the best vocoder solution and for the solutions produced by the GA. The average number of iterations was 8 and the average convergence time was 25.5 minutes. The settings produced by different GA runs for the same subject were slightly different; however, speech recognition scores measured with these settings were similar. Individual data from subjects showed that in each run, a small number of GA solutions produced poorer speech intelligibility than for the best setting. This was probably a result of the combination of the inherent randomness of the GA, the convergence criterion used in the present study, and possible errors that the users might have made during the paired comparisons. On the other hand, the effect of these errors was probably small compared to the other two factors, as a comparison between subjective preferences and objective measures showed that for many subjects the two were in good agreement. CONCLUSIONS The results showed that the GA was able to produce good solutions by using listener preferences in a relatively short time. For practical applications, the program can be made more robust by running the GA twice or by not using an automatic stopping criterion, and it can be made faster by optimizing the number of the paired comparisons completed in each iteration.
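The overall flow of such a preference-driven genetic algorithm can be sketched as follows. This toy version is not the study's implementation: the simulated listener, parameter ranges, population size, error rate, and stopping rule are all assumptions chosen only to show how paired comparisons can drive selection.

```python
# A toy sketch of the general approach (not the authors' implementation): a genetic
# algorithm over vocoder settings (channels, shift in mm, compression factor) in
# which selection is driven only by paired comparisons from a simulated listener.
import random

BEST = (16, 0.0, 1.0)  # hypothetical least-distorted setting

def prefer(a, b):
    """Simulated listener: prefers the setting closer to BEST, with 10% judgment errors."""
    def dist(s):
        return abs(s[0] - BEST[0]) / 16 + abs(s[1] - BEST[1]) / 6 + abs(s[2] - BEST[2])
    choice = a if dist(a) < dist(b) else b
    return choice if random.random() > 0.1 else (b if choice is a else a)

def random_setting():
    return (random.choice([4, 8, 12, 16]), random.uniform(-6, 6), random.uniform(0.5, 2.0))

def mutate(s):
    return (max(4, min(16, s[0] + random.choice([-4, 0, 4]))),
            s[1] + random.gauss(0, 1.0),
            max(0.5, min(2.0, s[2] + random.gauss(0, 0.2))))

population = [random_setting() for _ in range(6)]
for generation in range(8):
    # Paired comparisons select winners; mutated copies refill the population.
    winners = [prefer(population[i], population[i + 1]) for i in range(0, len(population), 2)]
    population = winners + [mutate(random.choice(winners)) for _ in range(len(population) - len(winners))]
print("final preferred setting:", population[0])
```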
Affiliation(s)
- Deniz Başkent
- Starkey Hearing Research Center, Berkeley, California 94704, USA.
|
91
|
Stacey PC, Summerfield AQ. Effectiveness of computer-based auditory training in improving the perception of noise-vocoded speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2007; 121:2923-35. [PMID: 17550190 DOI: 10.1121/1.2713668] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]
Abstract
Five experiments were designed to evaluate the effectiveness of "high-variability" lexical training in improving the ability of normal-hearing subjects to perceive noise-vocoded speech that had been spectrally shifted to simulate tonotopic misalignment. Two approaches to training were implemented. One training approach required subjects to recognize isolated words, while the other training approach required subjects to recognize words in sentences. Both approaches to training improved the ability to identify words in sentences. Improvements following a single session (lasting 1-2 h) of auditory training ranged between 7 and 12 percentage points and were significantly larger than improvements following a visual control task that was matched with the auditory training task in terms of the response demands. An additional three sessions of word- and sentence-based training led to further improvements, with the average overall improvement ranging from 13 to 18 percentage points. When a tonotopic misalignment of 3 mm rather than 6 mm was simulated, training with several talkers led to greater generalization to new talkers than training with a single talker. The results confirm that computer-based lexical training can help overcome the effects of spectral distortions in speech, and they suggest that training materials are most effective when several talkers are included.
Affiliation(s)
- Paula C Stacey
- Department of Psychology, University of York, Heslington, York YO10 5DD, United Kingdom.
|
92
|
Lin YS, Lee FP, Huang IS, Peng SC. Continuous improvement in Mandarin lexical tone perception as the number of channels increased: a simulation study of cochlear implant. Acta Otolaryngol 2007; 127:505-14. [PMID: 17453477 DOI: 10.1080/00016480600951434] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
Abstract
CONCLUSION In contrast to English phoneme recognition, where performance in cochlear implants (CIs) usually does not improve beyond six or eight channels, perception of Mandarin tones continued to improve as the total number of channels increased. OBJECTIVE To test our hypothesis that current CI strategies might be modified to improve Mandarin lexical tone perception. MATERIALS AND METHODS Lexical tone perception tests using 48 Mandarin Chinese monosyllables were conducted in 32 native Mandarin speakers with normal hearing. Tone perception was compared across the controlled factors: the total number of channels, the number of channels allocated to the F0 spectrum, and whether spectral shifts were present in the electrode configuration. An experimental condition that preserves the fine structure was used as a comparison. RESULTS The signal processing strategy using 16 channels, which is technically possible with current CI devices, produced better tone perception than strategies using 12 or 8 channels. Increasing the number of channels allocated to the F0 spectrum did not improve tone perception, and spectral shifts did not change tone perception. The experimental condition that preserves the fine structure (FiC12) produced significantly better overall tone perception scores than the other, envelope-based conditions.
Affiliation(s)
- Yung-Song Lin
- Department of Otolaryngology, Taipei Medical University, Chi Mei Medical Center, Tainan city, Taiwan, ROC.
|
93
|
Mitani C, Nakata T, Trehub SE, Kanda Y, Kumagami H, Takasaki K, Miyamoto I, Takahashi H. Music Recognition, Music Listening, and Word Recognition by Deaf Children with Cochlear Implants. Ear Hear 2007; 28:29S-33S. [PMID: 17496641 DOI: 10.1097/aud.0b013e318031547a] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES To examine the ability of congenitally deaf children to recognize music from incidental exposure and the relations among age at implantation, music listening, and word recognition. DESIGN Seventeen child implant users who were 4 to 8 yr of age were tested on their recognition and liking of musical excerpts from their favorite television programs. They were also assessed on open-set recognition of three-syllable words. Their parents completed a questionnaire about the children's musical activities. RESULTS Children identified the musical excerpts at better than chance levels, but only when they heard the original vocal/instrumental versions. Children's initiation of music listening at home was associated with younger ages at implantation and higher word recognition scores. CONCLUSIONS Child implant users enjoy music more than adult implant users. Moreover, younger age at implantation increases children's engagement with music, which may enhance their progress in other auditory domains.
Affiliation(s)
- Chisato Mitani
- Graduate Course in Humanistic Studies Specialized Research into Humanistic Studies, Department of Psychology, Nagasaki Junshin Catholic University, Nagasaki, Japan
|
94
|
Dorman MF, Spahr T, Gifford R, Loiselle L, McKarns S, Holden T, Skinner M, Finley C. An electric frequency-to-place map for a cochlear implant patient with hearing in the nonimplanted ear. J Assoc Res Otolaryngol 2007; 8:234-40. [PMID: 17351713 PMCID: PMC2441831 DOI: 10.1007/s10162-007-0071-1] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2006] [Accepted: 12/29/2006] [Indexed: 11/26/2022] Open
Abstract
The aim of this study was to relate the pitch of high-rate electrical stimulation delivered to individual cochlear implant electrodes to electrode insertion depth and insertion angle. The patient (CH1) was able to provide pitch matches between electric and acoustic stimulation because he had auditory thresholds in his nonimplanted ear ranging between 30 and 60 dB HL over the range 250 Hz to 8 kHz. Electrode depth and insertion angle were measured from high-resolution computed tomography (CT) scans of the patient's temporal bones. The scans were used to create a 3D image volume reconstruction of the cochlea, which allowed visualization of electrode position within the scala. The method of limits was used to establish pitch matches between acoustic pure tones and electric stimulation (a 1,652-pps unmodulated pulse train). The pitch matching data demonstrated that, for insertion angles greater than 450 degrees or insertion depths greater than approximately 20 mm, pitch saturated at approximately 420 Hz. From 20 to 15 mm insertion depth, pitch estimates were about one-half octave lower than the Greenwood function. From 13 to 3 mm insertion depth, the pitch estimates were approximately one octave lower than the Greenwood function. The pitch match for an electrode only 3.4 mm into the cochlea was 3,447 Hz. These data are consistent with other reports, e.g., Boëx et al. (2006), of a frequency-to-place map for the electrically stimulated cochlea in which perceived pitches for stimulation on individual electrodes are significantly lower than those predicted by the Greenwood function for stimulation at the level of the hair cell.
Affiliation(s)
- Michael F Dorman
- Department of Speech and Hearing Science, Arizona State University, Tempe, AZ 85287-0102, USA.
|
95
|
Stakhovskaya O, Sridhar D, Bonham BH, Leake PA. Frequency map for the human cochlear spiral ganglion: implications for cochlear implants. J Assoc Res Otolaryngol 2007; 8:220-33. [PMID: 17318276 PMCID: PMC2394499 DOI: 10.1007/s10162-007-0076-9] [Citation(s) in RCA: 301] [Impact Index Per Article: 17.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2006] [Accepted: 01/20/2007] [Indexed: 10/23/2022] Open
Abstract
The goals of this study were to derive a frequency-position function for the human cochlear spiral ganglion (SG) to correlate represented frequency along the organ of Corti (OC) to location along the SG, to determine the range of individual variability, and to calculate an "average" frequency map (based on the trajectories of the dendrites of the SG cells). For both OC and SG frequency maps, a potentially important limitation is that accurate estimates of cochlear place frequency based upon the Greenwood function require knowledge of the total OC or SG length, which cannot be determined in most temporal bone and imaging studies. Therefore, an additional goal of this study was to evaluate a simple metric, basal coil diameter that might be utilized to estimate OC and SG length. Cadaver cochleae (n = 9) were fixed <24 h postmortem, stained with osmium tetroxide, microdissected, decalcified briefly, embedded in epoxy resin, and examined in surface preparations. In digital images, the OC and SG were measured, and the radial nerve fiber trajectories were traced to define a series of frequency-matched coordinates along the two structures. Images of the cochlear turns were reconstructed and measurements of basal turn diameter were made and correlated with OC and SG measurements. The data obtained provide a mathematical function for relating represented frequency along the OC to that of the SG. Results showed that whereas the distance along the OC that corresponds to a critical bandwidth is assumed to be constant throughout the cochlea, estimated critical band distance in the SG varies significantly along the spiral. Additional findings suggest that measurements of basal coil diameter in preoperative images may allow prediction of OC/SG length and estimation of the insertion depth required to reach specific angles of rotation and frequencies. Results also indicate that OC and SG percentage length expressed as a function of rotation angle from the round window is fairly constant across subjects. The implications of these findings for the design and surgical insertion of cochlear implants are discussed.
Affiliation(s)
- Olga Stakhovskaya
- Epstein Laboratory, Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco, San Francisco, CA 94143-0526, USA.
|
96
|
Gani M, Valentini G, Sigrist A, Kós MI, Boëx C. Implications of deep electrode insertion on cochlear implant fitting. J Assoc Res Otolaryngol 2007; 8:69-83. [PMID: 17216585 PMCID: PMC2538415 DOI: 10.1007/s10162-006-0065-4] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2006] [Accepted: 10/30/2006] [Indexed: 11/30/2022] Open
Abstract
Using long Med-El Combi40+ electrode arrays, it is now possible to cover the whole length of the cochlea, up to about two turns. Such insertion depths have received little attention. To evaluate the contribution of deeply inserted electrodes, five Med-El cochlear implant users were tested on vowel and consonant identification with fittings in which one, two, and up to five of the most apical electrodes were successively deactivated. In addition, subjects performed pitch-ranking experiments, using loudness-balanced stimuli, to identify electrodes producing pitch confusions. Radiographs were taken to measure the insertion depth of each electrode. All subjects used each modified fitting for two periods of about 3 weeks. Throughout the experiment, the same stimulation rate and frequency range were maintained across all the fittings used by each individual subject. After each trial period the subject performed three consonant and three vowel identification tests. All subjects had deep electrode insertions, ranging from 605 degrees to 720 degrees. The two subjects with the deepest insertions showed significantly better vowel and consonant identification with fittings in which the two or three most apical electrodes were deactivated than with their standard fitting with all available electrodes activated. The other three subjects did not show significant improvements when one or two of their most apical electrodes were deactivated. Four out of five subjects preferred to continue using a fitting with one or more apical electrodes deactivated. The two subjects with the deepest insertions also showed pitch confusions between their most apical electrodes. Two possible explanations for these results are discussed: deactivating the most apical electrodes may reduce neural interactions among electrodes that produce pitch confusions, and it may improve the alignment between the frequency components coded by the electrical signals delivered to each electrode and the overall pitch of the auditory percept produced by electrical stimulation of the auditory nerve fibers.
Affiliation(s)
- Mathieu Gani
- “Centre Romand d’Implants Cochléaires” Department of Otolaryngology-Head and Neck Surgery, University Hospital of Geneva, Geneva, Switzerland
- Gregory Valentini
- “Centre Romand d’Implants Cochléaires” Department of Otolaryngology-Head and Neck Surgery, University Hospital of Geneva, Geneva, Switzerland
- Alain Sigrist
- “Centre Romand d’Implants Cochléaires” Department of Otolaryngology-Head and Neck Surgery, University Hospital of Geneva, Geneva, Switzerland
- Maria-Izabel Kós
- “Centre Romand d’Implants Cochléaires” Department of Otolaryngology-Head and Neck Surgery, University Hospital of Geneva, Geneva, Switzerland
- Colette Boëx
- “Centre Romand d’Implants Cochléaires” Department of Otolaryngology-Head and Neck Surgery, University Hospital of Geneva, Geneva, Switzerland
- Department of Neurology, University Hospital of Geneva, Geneva, Switzerland
- Clinique et Policlinique de Neurologie, Hôpitaux Universitaires de Genève, Rue Micheli-du-Crest, 24, CH-1211 Genève 14, Switzerland
|
97
|
Abstract
Because there are many parameters in the cochlear implant (CI) device that can be optimized for individual patients, it is important to estimate a parameter's effect before patient evaluation. In this paper, Mel-frequency cepstrum coefficients (MFCCs) were used to estimate the acoustic vowel space for vowel stimuli processed by the CI simulations. The acoustic space was then compared to vowel recognition performance by normal-hearing subjects listening to the same processed speech. Five CI speech processor parameters were simulated to produce different degrees of spectral resolution, spectral smearing, spectral warping, spectral shifting, and amplitude distortion. The acoustic vowel space was highly correlated with the normal-hearing subjects' vowel recognition performance for parameters that affected the number of spectral channels and the degree of spectral smearing. However, the acoustic vowel space was not significantly correlated with perceptual performance for parameters that affected the degree of spectral warping, spectral shifting, and amplitude distortion. In particular, while spectral warping and shifting did not significantly reshape the acoustic space, vowel recognition performance was significantly affected by these parameters. The results from the acoustic analysis suggest that the CI device can preserve phonetic distinctions under conditions of spectral warping and shifting. Auditory training may help CI patients better perceive these speech cues transmitted by their speech processors.
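The kind of MFCC-based "acoustic vowel space" measure described above might be computed along the following lines. The synthetic two-formant vowels, the librosa MFCC settings, and the mean pairwise-distance metric are illustrative assumptions rather than the paper's stimuli or exact analysis.

```python
# A minimal, hypothetical sketch of the analysis idea: MFCC centroids for a set of
# vowel tokens, with the mean pairwise distance between centroids taken as a crude
# "vowel space" size. The crude two-formant synthetic vowels are only stand-ins.
import itertools
import numpy as np
import librosa

fs = 16000
t = np.arange(int(0.3 * fs)) / fs
formants = {"i": (300, 2300), "a": (700, 1200), "u": (350, 800)}  # rough F1/F2 values

def synth_vowel(f1, f2, f0=120.0):
    # Harmonics of f0 weighted by their closeness to the two formant frequencies.
    sig = np.zeros_like(t)
    for h in range(1, 40):
        f = h * f0
        gain = np.exp(-((f - f1) / 150) ** 2) + np.exp(-((f - f2) / 250) ** 2)
        sig += gain * np.sin(2 * np.pi * f * t)
    return sig / np.max(np.abs(sig))

centroids = {v: librosa.feature.mfcc(y=synth_vowel(*ff), sr=fs, n_mfcc=13).mean(axis=1)
             for v, ff in formants.items()}
dists = [np.linalg.norm(a - b) for a, b in itertools.combinations(centroids.values(), 2)]
print("acoustic vowel-space size (mean MFCC distance):", np.mean(dists))
```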
Affiliation(s)
- Chuping Liu
- Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90007, USA.
|
98
|
Smith MW, Faulkner A. Perceptual adaptation by normally hearing listeners to a simulated "hole" in hearing. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2006; 120:4019-30. [PMID: 17225428 DOI: 10.1121/1.2359235] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/13/2023]
Abstract
Simulations of cochlear implants have demonstrated that the deleterious effects of a frequency misalignment between analysis bands and characteristic frequencies at basally shifted simulated electrode locations are significantly reduced with training. However, a distortion of frequency-to-place mapping may also arise due to a region of dysfunctional neurons that creates a "hole" in the tonotopic representation. This study simulated a 10 mm hole in the mid-frequency region. Noise-band processors were created with six output bands (three apical and three basal to the hole). The spectral information that would have been represented in the hole was either dropped or reassigned to bands on either side. Such reassignment preserves information but warps the place code, which may in itself impair performance. Normally hearing subjects received three hours of training in two reassignment conditions. Speech recognition improved considerably with training. Scores were much lower in a baseline (untrained) condition where information from the hole region was dropped. A second group of subjects trained in this dropped condition did show some improvement; however, scores after training were significantly lower than in the reassignment conditions. These results are consistent with the view that speech processors should present the most informative frequency range irrespective of frequency misalignment.
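The two conditions can be sketched as alternative analysis-band allocations. In the sketch below, the input frequency range, the location of the simulated hole, and the rule for splitting the hole region at its log-frequency midpoint are assumptions for illustration, not the study's exact processor settings.

```python
# A schematic sketch (frequency values are illustrative): spectral information
# falling in a simulated mid-frequency "hole" is either dropped or reassigned to
# the three analysis bands on each side of the hole.
import numpy as np

def log_bands(lo, hi, n):
    edges = np.logspace(np.log10(lo), np.log10(hi), n + 1)
    return list(zip(edges[:-1], edges[1:]))

full_range = (200.0, 7000.0)  # assumed input speech range (Hz)
hole = (1000.0, 3000.0)       # frequencies whose cochlear place falls in the dead region

# "Dropped": analysis bands cover only the regions flanking the hole.
dropped = log_bands(full_range[0], hole[0], 3) + log_bands(hole[1], full_range[1], 3)

# "Reassigned": the hole region is split at its log-frequency midpoint and its
# information is warped onto the apical and basal bands on either side.
mid = np.sqrt(hole[0] * hole[1])
reassigned = log_bands(full_range[0], mid, 3) + log_bands(mid, full_range[1], 3)

for name, bands in (("dropped", dropped), ("reassigned", reassigned)):
    print(name, [f"{lo:.0f}-{hi:.0f}" for lo, hi in bands])
```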
Affiliation(s)
- Matthew W Smith
- Department of Phonetics and Linguistics, UCL, Wolfson House, 4 Stephenson Way, London NW1 2HE, United Kingdom.
|
99
|
Li T, Fu QJ. Perceptual adaptation to spectrally shifted vowels: training with nonlexical labels. J Assoc Res Otolaryngol 2006; 8:32-41. [PMID: 17131213 PMCID: PMC2538416 DOI: 10.1007/s10162-006-0059-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2005] [Accepted: 10/09/2006] [Indexed: 11/30/2022] Open
Abstract
Although normal-hearing (NH) and cochlear implant (CI) listeners are able to adapt to spectrally shifted speech to some degree, auditory training has been shown to provide more complete and/or accelerated adaptation. However, it is unclear whether listeners use auditory and visual feedback to improve discrimination of speech stimuli, or to learn the identity of speech stimuli. The present study investigated the effects of training with lexical and nonlexical labels on NH listeners' perceptual adaptation to spectrally degraded and spectrally shifted vowels. An eight-channel sine wave vocoder was used to simulate CI speech processing. Two degrees of spectral shift (moderate and severe shift) were studied with three training paradigms, including training with lexical labels (i.e., "hayed," "had," "who'd," etc.), training with nonlexical labels (i.e., randomly assigned letters "f," "b," "g," etc.), and repeated testing with lexical labels (i.e., "test-only" paradigm without feedback). All training and testing was conducted over 5 consecutive days, with two to four training exercises per day. Results showed that with the test-only paradigm, lexically labeled vowel recognition significantly improved for moderately shifted vowels; however, there was no significant improvement for severely shifted vowels. Training with nonlexical labels significantly improved the recognition of nonlexically labeled vowels for both shift conditions; however, this improvement failed to generalize to lexically labeled vowel recognition with severely shifted vowels. Training with lexical labels significantly improved lexically labeled vowel recognition with severely shifted vowels. These results suggest that storage and retrieval of speech patterns in the central nervous system are somewhat robust to tonotopic distortion and spectral degradation. Although training with nonlexical labels may improve discrimination of spectrally distorted peripheral patterns, lexically meaningful feedback is needed to identify these peripheral patterns. The results also suggest that training with lexically meaningful feedback may be beneficial to CI users, especially patients with shallow electrode insertion depths.
Affiliation(s)
- Tianhao Li
- Department of Biomedical Engineering, University of Southern California, Los Angeles, CA 90089, USA.
|
100
|
Faulkner A, Rosen S, Norman C. The right information may matter more than frequency-place alignment: simulations of frequency-aligned and upward shifting cochlear implant processors for a shallow electrode array insertion. Ear Hear 2006; 27:139-52. [PMID: 16518142 DOI: 10.1097/01.aud.0000202357.40662.85] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
OBJECTIVE It has been claimed that speech recognition with a cochlear implant is dependent on the correct frequency alignment of analysis bands in the speech processor with characteristic frequencies (CFs) at electrode locations. However, the use of filters aligned in frequency to a relatively basal electrode array position leads to significant loss of lower frequency speech information. This study uses an acoustic simulation to compare two approaches to the matching of speech processor filters to an electrode array having a relatively shallow depth within the typical range, such that the most apical element is at a CF of 1851 Hz. Two noise-excited vocoder speech processors are compared, one with CF-matched filters, and one with filters matched to CFs at basilar membrane locations 6 mm more apical than electrode locations. DESIGN An extended crossover training design examined pre- and post-training performance in the identification of vowels and words in sentences for both processors. Subjects received about 3 hours of training with each processor in turn. RESULTS Training improved performance with both processors, but training effects were greater for the shifted processor. For a male talker, the shifted processor led to higher post-training scores than the frequency-aligned processor with both vowels and sentences. For a female talker, post-training vowel scores did not differ significantly between processors, whereas sentence scores were higher with the frequency-aligned processor. CONCLUSIONS Even for a shallow electrode insertion, we conclude that a speech processor should represent information from important frequency regions below 1 kHz and that the possible cost of frequency misalignment can be significantly reduced with listening experience.
Affiliation(s)
- Andrew Faulkner
- Department of Phonetics and Linguistics, University College London, Wolfson House, London, United Kingdom.
|