101
Laneau J, Moonen M, Wouters J. Factors affecting the use of noise-band vocoders as acoustic models for pitch perception in cochlear implants. J Acoust Soc Am 2006;119:491-506. [PMID: 16454303] [DOI: 10.1121/1.2133391]
Abstract
Although in a number of experiments noise-band vocoders have been shown to provide acoustic models for speech perception in cochlear implants (CI), the present study assesses in four experiments whether and under what limitations noise-band vocoders can be used as an acoustic model for pitch perception in CI. The first two experiments examine the effect of spectral smearing on simulated electrode discrimination and fundamental frequency (F0) discrimination. The third experiment assesses the effect of spectral mismatch in an F0-discrimination task with two different vocoders. The fourth experiment investigates the effect of amplitude compression on modulation rate discrimination. For each experiment, the results obtained from normal-hearing subjects presented with vocoded stimuli are compared to results obtained directly from CI recipients. The results show that place pitch sensitivity drops with increased spectral smearing and that place pitch cues for multi-channel stimuli can adequately be mimicked when the discriminability of adjacent channels is adjusted by varying the spectral slopes to match that of CI subjects. The results also indicate that temporal pitch sensitivity is limited for noise-band carriers with low center frequencies and that the absence of a compression function in the vocoder might alter the saliency of the temporal pitch cues.
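The noise-band vocoder technique behind these acoustic models can be sketched in a few lines: band-pass the input, extract each band's slowly varying envelope, and use it to modulate a noise carrier limited to the same band. A minimal single-channel numpy sketch, with illustrative (assumed) filter settings rather than any particular study's parameters:

```python
import numpy as np

def bandpass_fft(x, fs, lo, hi):
    """Crude FFT brick-wall band-pass filter (illustrative only)."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), 1.0 / fs)
    X[(f < lo) | (f > hi)] = 0.0
    return np.fft.irfft(X, len(x))

def vocode_channel(x, fs, lo, hi, env_cut=160.0, seed=0):
    """One noise-band vocoder channel: band-pass, rectify and smooth
    to get the envelope, then reimpose it on band-limited noise."""
    band = bandpass_fft(x, fs, lo, hi)
    env = bandpass_fft(np.abs(band), fs, 0.0, env_cut)  # rectify + low-pass
    env = np.maximum(env, 0.0)
    noise = np.random.default_rng(seed).standard_normal(len(x))
    return bandpass_fft(env * bandpass_fft(noise, fs, lo, hi), fs, lo, hi)

fs = 16000
t = np.arange(fs) / fs
# 300-Hz tone, amplitude-modulated at 4 Hz, inside a 250-500 Hz channel
x = (1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 300 * t)
y = vocode_channel(x, fs, 250.0, 500.0)

# The output is noise, but its energy stays confined to the channel band
spec = np.abs(np.fft.rfft(y))
f = np.fft.rfftfreq(len(y), 1.0 / fs)
band_ratio = spec[(f >= 250) & (f <= 500)].sum() / spec.sum()
```

A full simulation applies this per analysis band and sums the channels; spectral smearing of the kind studied above would be modeled by widening the carrier bands or shallowing the filter slopes.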
Affiliation(s)
- Johan Laneau
- Laboratory for Experimental ORL, K.U.Leuven, Kapucijnenvoer 33, B 3000 Leuven, Belgium.

102
Fu QJ, Galvin J, Wang X, Nogaki G. Effects of auditory training on adult cochlear implant patients: a preliminary report. Cochlear Implants Int 2006. [DOI: 10.1002/cii.181]
103
Fu QJ, Nogaki G, Galvin JJ. Auditory training with spectrally shifted speech: implications for cochlear implant patient auditory rehabilitation. J Assoc Res Otolaryngol 2005;6:180-9. [PMID: 15952053] [PMCID: PMC2538336] [DOI: 10.1007/s10162-005-5061-6]
Abstract
After implantation, postlingually deafened cochlear implant (CI) patients must adapt to both spectrally reduced and spectrally shifted speech, due to the limited number of electrodes and the limited length of the electrode array. This adaptation generally occurs during the first three to six months of implant use and may continue for many years. To see whether moderate speech training can accelerate this learning process, 16 naïve, normal-hearing listeners were trained with spectrally shifted speech via an eight-channel acoustic simulation of CI speech processing. Baseline vowel and consonant recognition was measured for both spectrally shifted and unshifted speech. Short daily training sessions were conducted over five consecutive days, using four different protocols. For the test-only protocol, no improvement was seen over the five-day period. Similarly, sentence training provided little benefit for vowel recognition. However, after five days of targeted phoneme training, recognition of spectrally shifted vowels improved significantly in most subjects. This improvement did not generalize to the spectrally unshifted vowel and consonant tokens, suggesting that subjects adapted to the specific spectral shift rather than to the eight-channel processing in general. Interestingly, significant improvement was also observed for the recognition of spectrally shifted consonants. The largest improvement came with targeted vowel-contrast training, which did not include any explicit consonant training. These results suggest that targeted phoneme training can accelerate adaptation to spectrally shifted speech. Given these results with normal-hearing listeners, auditory rehabilitation tools that provide targeted phoneme training may be effective in improving the speech recognition performance of adult CI users.
Affiliation(s)
- Qian-Jie Fu
- Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, CA, USA.

104
Başkent D, Shannon RV. Interactions between cochlear implant electrode insertion depth and frequency-place mapping. J Acoust Soc Am 2005;117:1405-1416. [PMID: 15807028] [DOI: 10.1121/1.1856273]
Abstract
While new electrode designs allow deeper insertion and wider coverage in the cochlea, there is still considerable variation in the insertion depth of the electrode array among cochlear implant users. The present study measures speech recognition as a function of insertion depth, varying from a deep insertion of 10 electrodes at 28.8 mm to a shallow insertion of a single electrode at 7.2 mm, in four Med-El Combi 40+ users. Short insertion depths were simulated by inactivating apical electrodes. Speech recognition increased with deeper insertion, reaching an asymptotic level at 21.6 or 26.4 mm depending on the frequency-place map used. Başkent and Shannon [J. Acoust. Soc. Am. 116, 3130-3140 (2004)] showed that speech recognition by implant users was best when the acoustic input frequency was matched onto the cochlear location that normally processes that frequency range, minimizing the spectral distortions in the map. However, if an electrode array is not fully inserted into the cochlea, a matched map will result in the loss of considerable low-frequency information. The results show a strong interaction between the optimal frequency-place mapping and electrode insertion depth. Consistent with previous studies, frequency-place matching produced better speech recognition than compressing the full speech range onto the electrode array for full insertion ranges (20 to 25 mm from the round window). For shallower insertions (16.8 and 19.2 mm) a mild amount of frequency-place compression was better than truncating the frequency range to match the basal cochlear location. These results show that patients with shallow electrode insertions might benefit from a map that assigns a narrower frequency range than patients with full insertions.
Affiliation(s)
- Deniz Başkent
- Department of Biomedical Engineering, University of Southern California, Los Angeles, California 90089, USA.

105
Glueckert R, Pfaller K, Kinnefors A, Rask-Andersen H, Schrott-Fischer A. The Human Spiral Ganglion: New Insights into Ultrastructure, Survival Rate and Implications for Cochlear Implants. Audiol Neurootol 2005;10:258-73. [PMID: 15925863] [DOI: 10.1159/000086000]
Abstract
This study was based on high-resolution SEM assessment of freshly fixed, normal-hearing human inner ear tissue. In addition, semiquantitative observations were made in long-term deafened temporal bone material, focusing on the spiral ganglia and nerve projections, and a detailed study of the fine bone structure in macerated tissues was performed. Our main findings detail the presence of extensive bony fenestrae surrounding the nerve elements, permitting a relatively free flow of perilymph to modiolar structures. The clustering of the spiral ganglion cells in Rosenthal's canal and the detailed and intricate course of postganglionic axons are described. The close proximity of fibers to cell somata is demonstrated by impressions in the cell surfaces and by the presence of small microvilli-like structures at the contact regions, anchoring nerve fibers to the cell wall. Extensive fenestrae and the presence of a fragile network of endosteal bony structures at the surfaces guiding nerve fibers are described in detail for the first time. This unique freshly prepared human material offers the opportunity for a detailed ultrastructural study not previously possible with postmortem-fixed material, and provides more accurate information for modeling electrostimulation of the human auditory nerve through a cochlear implant. On the basis of this study, we suggest that the concentration and high density of spiral ganglion cells, and the close physical interaction between neural elements, may explain the slow retrograde degeneration found in humans after loss of peripheral receptors. Moreover, the fragile bony columns connecting the spiral canal with the osseous spiral lamina may be a potential site for trauma in (perimodiolar) electrode positioning.
Affiliation(s)
- Rudolf Glueckert
- Department of Otolaryngology, Institute of Anatomy and Histology, Medical University of Innsbruck, Innsbruck, Austria

106
Shannon RV. Speech and music have different requirements for spectral resolution. Int Rev Neurobiol 2005;70:121-34. [PMID: 16472633] [DOI: 10.1016/s0074-7742(05)70004-0]
107
Başkent D, Shannon RV. Frequency-place compression and expansion in cochlear implant listeners. J Acoust Soc Am 2004;116:3130-3140. [PMID: 15603158] [DOI: 10.1121/1.1804627]
Abstract
In multichannel cochlear implants, low frequency information is delivered to apical cochlear locations while high frequency information is delivered to more basal locations, mimicking the normal acoustic tonotopic organization of the auditory nerves. In clinical practice, little attention has been paid to the distribution of acoustic input across the electrodes of an individual patient that might vary in terms of spacing and absolute tonotopic location. In normal-hearing listeners, Başkent and Shannon (J. Acoust. Soc. Am. 113, 2003) simulated implant signal processing conditions in which the frequency range assigned to the array was systematically made wider or narrower than the simulated stimulation range in the cochlea, resulting in frequency-place compression or expansion, respectively. In general, the best speech recognition was obtained when the input acoustic information was delivered to the matching tonotopic place in the cochlea with least frequency-place distortion. The present study measured phoneme and sentence recognition scores with similar frequency-place manipulations in six Med-El Combi 40+ implant subjects. Stimulation locations were estimated using the Greenwood mapping function based on the estimated electrode insertion depth. Results from frequency-place compression and expansion with implants were similar to simulation results, especially for postlingually deafened subjects, despite the uncertainty in the actual stimulation sites of the auditory nerves. The present study shows that frequency-place mapping is an important factor in implant performance and an individual implant patient's map could be optimized with functional tests using frequency-place manipulations.
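The Greenwood mapping function used above to estimate stimulation sites relates cochlear place to characteristic frequency. A sketch with the standard human constants (assumed here; the paper's exact parameter values are not given in the abstract):

```python
import math

# Greenwood (1990) frequency-place function for the human cochlea:
#   f = A * (10**(a * x) - k), x = distance from apex as a fraction of length
A, a, k = 165.4, 2.1, 0.88   # standard human constants (assumed)
LENGTH_MM = 35.0             # assumed cochlear duct length

def place_to_freq(mm_from_base):
    """Characteristic frequency (Hz) at a given distance from the base."""
    x = (LENGTH_MM - mm_from_base) / LENGTH_MM  # fraction from apex
    return A * (10.0 ** (a * x) - k)

def freq_to_place(freq_hz):
    """Distance from the base (mm) whose CF equals freq_hz."""
    x = math.log10(freq_hz / A + k) / a
    return LENGTH_MM * (1.0 - x)

# e.g. an electrode about 25 mm from the base sits near 500 Hz
cf_25mm = place_to_freq(25.0)
```

`freq_to_place` gives the insertion depth whose characteristic frequency matches a given analysis-band frequency, which is how a matched frequency-place map can be constructed from an estimated electrode insertion depth.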
Affiliation(s)
- Deniz Başkent
- Department of Biomedical Engineering, University of Southern California, Los Angeles, California 90089, USA.

108
Abstract
OBJECTIVE The first specific aim of the present study is to compare the ability of normal-hearing and cochlear implant listeners to use temporal cues in three music perception tasks: tempo discrimination, rhythmic pattern identification, and melody identification. The second aim is to identify the relative contribution of temporal and spectral cues to melody recognition in acoustic and electric hearing. DESIGN Both normal-hearing and cochlear implant listeners participated in the experiments. Tempo discrimination was measured in a two-interval forced-choice procedure in which subjects were asked to choose the faster tempo at four standard tempo conditions (60, 80, 100, and 120 beats per minute). For rhythmic pattern identification, seven different rhythmic patterns were created and subjects were asked to read and choose the musical notation displayed on the screen that corresponded to the rhythmic pattern presented. Melody identification was evaluated with two sets of 12 familiar melodies. One set contained both rhythm and melody information (rhythm condition), whereas the other set contained only melody information (no-rhythm condition). Melody stimuli were also processed to extract the slowly varying temporal envelope from 1, 2, 4, 8, 16, 32, and 64 frequency bands, to create cochlear implant simulations. Subjects listened to a melody and had to respond by choosing one of the 12 names corresponding to the melodies displayed on a computer screen. RESULTS In tempo discrimination, the cochlear implant listeners performed similarly to the normal-hearing listeners, with rate-discrimination difference limens of 4-6 beats per minute. In rhythmic pattern identification, the cochlear implant listeners performed 5-25 percentage points poorer than the normal-hearing listeners. The normal-hearing listeners achieved perfect scores in melody identification with and without the rhythmic cues. However, the cochlear implant listeners performed significantly poorer than the normal-hearing listeners in both the rhythm and no-rhythm conditions. The simulation results from normal-hearing listeners showed a relatively high level of performance for all numbers of frequency bands in the rhythm condition but required as many as 32 bands in the no-rhythm condition. CONCLUSIONS Cochlear implant listeners performed normally in tempo discrimination, but significantly poorer than normal-hearing listeners in rhythmic pattern identification and melody recognition. While both temporal (rhythmic) and spectral (pitch) cues contribute to melody recognition, cochlear implant listeners relied mostly on the rhythmic cues. Without the rhythmic cues, high spectral resolution with as many as 32 bands was needed for melody recognition by normal-hearing listeners. This result indicates that present cochlear implants provide sufficient spectral cues to support speech recognition in quiet, but not enough to support music perception. Increasing the number of functional channels and improving the encoding of fine-structure information are necessary to improve music perception for cochlear implant listeners.
Affiliation(s)
- Ying-Yee Kong
- Department of Cognitive Sciences, University of California, Irvine, 92697, USA.

109
Deeks JM, Carlyon RP. Simulations of cochlear implant hearing using filtered harmonic complexes: implications for concurrent sound segregation. J Acoust Soc Am 2004;115:1736-1746. [PMID: 15101652] [DOI: 10.1121/1.1675814]
Abstract
Two experiments used simulations of cochlear implant hearing to investigate the use of temporal codes in speech segregation. Sentences were filtered into six bands, and their envelopes were used to modulate filtered alternating-phase harmonic complexes with rates of 80 or 140 pps. Experiment 1 showed that identification of single sentences was better for the higher rate. In experiment 2, maskers (time-reversed concatenated sentences) were scaled by -9 dB relative to a target sentence, which was added with an offset of 1.2 s. When the target and masker were each processed on all six channels and then summed, processing the masker at a different rate from the target improved performance only when the target rate was 140 pps. When the target sentence was processed on the odd-numbered channels and the masker on the even-numbered channels, or vice versa, performance was worse overall but showed similar effects of pulse rate. The results, combined with recent psychophysical evidence, suggest that differences in pulse rate are unlikely to prove useful for concurrent sound segregation.
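An alternating-phase harmonic complex of the kind used as a carrier here can be generated directly. The sine/cosine phase alternation below is one common scheme (an assumption; the abstract does not specify the exact phases), and its effect is that the envelope, and hence the pulse rate conveyed to listeners, repeats at twice the complex's F0 when the harmonics are unresolved. The F0 and harmonic range here are illustrative:

```python
import numpy as np

def alt_phase_complex(f0, n_lo, n_hi, fs, dur):
    """Alternating-phase harmonic complex: odd harmonics in sine phase,
    even harmonics in cosine phase (one common 'alternating' scheme)."""
    t = np.arange(int(fs * dur)) / fs
    s = np.zeros_like(t)
    for n in range(n_lo, n_hi + 1):
        phase = 0.0 if n % 2 else np.pi / 2
        s += np.sin(2 * np.pi * n * f0 * t + phase)
    return s / (n_hi - n_lo + 1)

fs, f0, dur = 16000, 40.0, 0.5   # F0 = 40 Hz -> envelope rate ~80 pps
x = alt_phase_complex(f0, 10, 20, fs, dur)  # harmonics 400-800 Hz

# Envelope via the analytic signal (FFT-based Hilbert transform)
N = len(x)
h = np.zeros(N); h[0] = 1; h[1:N // 2] = 2; h[N // 2] = 1
env = np.abs(np.fft.ifft(np.fft.fft(x) * h))

# The envelope repeats every half F0-period, i.e. at 2 * f0
period_2f0 = int(fs / (2 * f0))
env_spec = np.abs(np.fft.rfft(env - env.mean()))
bin_f0 = int(round(f0 * dur))        # spectral bin at f0
bin_2f0 = 2 * bin_f0                 # spectral bin at 2 * f0
```

With this phase scheme the cross-terms at F0 spacing cancel, so the envelope modulation is dominated by the 2×F0 component, which is what gives these carriers their characteristic pulse rate.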
Affiliation(s)
- John M Deeks
- MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2 2EF, United Kingdom

110
Dorman MF, Ketten D. Adaptation by a cochlear-implant patient to upward shifts in the frequency representation of speech. Ear Hear 2003;24:457-60. [PMID: 14534415] [DOI: 10.1097/01.aud.0000090438.20404.d9]
Abstract
The purpose of this project was to assess the degree to which a patient, after 1 wk of experience, could adapt to 3.2-mm and 6.8-mm basal shifts in the representation of speech. Only small deficits in performance were found after practice with the 3.2-mm shift. After practice with the 6.8-mm shift, scores on tests that emphasized amplitude envelope cues returned to baseline levels. Scores on vowel and sentence tests that emphasized frequency-based cues remained poor. Scores for "place," however, showed some recovery. Vowel recognition may be the limiting factor in recognizing basally shifted speech.
Affiliation(s)
- Michael F Dorman
- Department of Speech and Hearing Science, Arizona State University, Tempe, AZ 85287-0102, USA.

111
Speech Perception with Cochlear Implants. In: Cochlear Implants: Auditory Prostheses and Electric Hearing. 2004. [DOI: 10.1007/978-0-387-22585-2_8]
112
Kuchta J, Otto SR, Shannon RV, Hitselberger WE, Brackmann DE. The multichannel auditory brainstem implant: how many electrodes make sense? J Neurosurg 2004;100:16-23. [PMID: 14743907] [DOI: 10.3171/jns.2004.100.1.0016]
Abstract
Object. Development of multichannel auditory brainstem implant (ABI) systems has been based in part on the assumption that audiological outcome can be optimized by increasing the number of available electrodes. In this paper the authors critically analyze this assumption on the basis of a retrospective clinical study performed using the Nucleus 22 ABI surface electrode array.
Methods. The perceptual performances of 61 patients with neurofibromatosis Type 2 were tested approximately 6 weeks after an eight-electrode ABI had been implanted. Of the eight implanted electrodes, 5.57 ± 2.57 (mean ± standard deviation [SD]) provided auditory sensations when stimulated. Electrodes were deactivated when stimulation resulted in significant nonauditory side effects or no auditory sensation at all, and also when they failed to provide distinctive pitch sensations. The mean (± SD) scores for patients with ABIs were the following: sound-only consonant recognition, 20.4 ± 14.3% (range 0–65%); vowel recognition, 28.8 ± 18% (range 0–67%); Monosyllable Trochee Spondee (MTS) word recognition, 41.1 ± 25.3% (range 0–100%); and sentence recognition, 5.3 ± 11.4% (range 0–64%). Performance in patients in whom between one and three electrodes provided auditory sensation was significantly poorer than that in patients with between four and eight functional electrodes in the vowel, MTS word, and City University of New York (CUNY) sentence recognition tests. The correlation between performance and electrode number did not reach the 0.05 level of significance with respect to the sound effect, consonant, and MTS stress-pattern recognition tests, probably because a satisfactory performance in these tests can be obtained with temporal cues alone, that is, without any information about the frequency of the sounds. In the MTS word and the CUNY sentence recognition tests, performance was optimal in the patients with eight functional electrodes. Although all top performers had more than three functional auditory electrodes, no further improvement (asymptotic performance) was seen in those with five or more active electrodes in the consonant, vowel, and sound effect recognition tests.
Conclusions. A minimum of three spectral channels, programmed in the appropriate individual tonotopic order, seems to be required for satisfactory speech recognition in most patients with an ABI. Due to the limited access to the tonotopic frequency gradient of the cochlear nucleus with surface stimulation, patients with ABI do not receive a wide range of spectral cues (frequency information) with multielectrode (> 5) surface arrays.
Affiliation(s)
- Johannes Kuchta
- Department of Neurosurgery, Cologne University, Cologne, Germany.

113
Au DKK. Effects of stimulation rates on Cantonese lexical tone perception by cochlear implant users in Hong Kong. Clin Otolaryngol 2003;28:533-8. [PMID: 14616671] [DOI: 10.1046/j.1365-2273.2003.00747.x]
Abstract
High, moderate and low stimulation rates of 1800, 800 and 400 pulses per second (pps) per channel, respectively, were used to test the effects of stimulation rate on the discrimination and identification of Cantonese lexical tones in 11 Chinese post-lingually deafened adults with cochlear implants (CIs). The subjects were implanted with the MED-EL Combi 40+ CI system. They were randomly assigned to each of the stimulation rate conditions according to an ABC design. In both Cantonese lexical tone perception tests, the subjects reached the highest scores in the high-stimulation-rate condition and the lowest scores in the low-stimulation-rate condition (P < 0.01). Post hoc comparisons between different stimulation rates did not yield consistent results. This study demonstrated that the maximum stimulation rate of 1800 pps/channel could be an 'optimal' stimulation rate and an informed parameter choice for the benefit of Cantonese-speaking CI users in lexical tone perception.
Affiliation(s)
- D K K Au
- Division of Otorhinolaryngology, Department of Surgery, University of Hong Kong Medical Centre, Queen Mary Hospital, Hong Kong SAR, China.

114
Baskent D, Shannon RV. Speech recognition under conditions of frequency-place compression and expansion. J Acoust Soc Am 2003;113:2064-2076. [PMID: 12703717] [DOI: 10.1121/1.1558357]
Abstract
In normal acoustic hearing the mapping of acoustic frequency information onto the appropriate cochlear place is a natural biological function, but in cochlear implants it is controlled by the speech processor. The cochlear tonotopic range of the implant is determined by the length and insertion depth of the electrode array. Conventional cochlear implant electrode arrays are designed for an insertion of 25 mm inside the round window, and the active electrodes occupy 16 mm, which would place the electrodes in a cochlear region corresponding to an acoustic frequency range of 500-6000 Hz. However, some implant speech processors map an acoustic frequency range from 150 to 10000 Hz onto these electrodes. While this mapping preserves the entire range of acoustic frequency information, it also results in a compression of the tonotopic pattern of speech information delivered to the brain. The present study measured the effects of such a compression of frequency-to-place mapping on speech recognition using acoustic simulations. Also measured were the effects of an expansion of the frequency-to-place mapping, which produces an expanded representation of speech in the cochlea. Such an expanded representation might improve speech recognition by improving the relative spatial (tonotopic) resolution, like an "acoustic fovea." Phoneme and sentence recognition was measured as a function of linear (in terms of cochlear distance) frequency-place compression and expansion. These conditions were presented to normal-hearing listeners using a noise-band vocoder, simulating cochlear implant electrodes with different insertion depths and different numbers of electrode channels. The cochlear tonotopic range was held constant by employing the same noise carrier bands for each condition, while the analysis frequency range was either compressed or expanded relative to the carrier frequency range.
For each condition, the result was compared to that of the perfect frequency-place match, where the carrier and the analysis bands were perfectly matched. Speech recognition in the matched conditions was generally better than any condition of frequency-place expansion and compression, even when the matched condition eliminated a considerable amount of acoustic information. This result suggests that speech recognition, at least without training, is dependent on the mapping of acoustic frequency information onto the appropriate cochlear place.
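The matched versus compressed mappings described above can be made concrete by computing band edges that are linearly spaced in cochlear distance. The sketch below uses an assumed Greenwood place-frequency map and illustrative frequency ranges: a matched map uses the same analysis and carrier ranges, while a compressed map squeezes a wider acoustic range onto the same carrier bands.

```python
import numpy as np

A, a, k = 165.4, 2.1, 0.88   # assumed Greenwood constants (human)

def greenwood(x):
    """CF (Hz) at fractional distance x from the apex."""
    return A * (10.0 ** (a * x) - k)

def band_edges(f_lo, f_hi, n_bands):
    """Band edges linearly spaced in cochlear distance (not in Hz)."""
    inv = lambda f: np.log10(f / A + k) / a          # Hz -> place
    x = np.linspace(inv(f_lo), inv(f_hi), n_bands + 1)
    return greenwood(x)

# Matched map: analysis bands coincide with the carrier (cochlear) bands.
carrier = band_edges(500.0, 6000.0, 8)
# Compressed map: a wider acoustic range (150-10000 Hz, as some processors
# use) squeezed onto the same eight carrier bands.
compressed = band_edges(150.0, 10000.0, 8)
```

In a simulation, each analysis band's envelope would drive the corresponding carrier band; the compressed condition simply changes which acoustic frequencies feed each carrier.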
Affiliation(s)
- Deniz Baskent
- Department of Biomedical Engineering, University of Southern California, Los Angeles, California 90089, USA.

115
Fu QJ, Galvin JJ. The effects of short-term training for spectrally mismatched noise-band speech. J Acoust Soc Am 2003;113:1065-1072. [PMID: 12597199] [DOI: 10.1121/1.1537708]
Abstract
The present study examined the effects of short-term perceptual training on normal-hearing listeners' ability to adapt to spectrally altered speech patterns. Using noise-band vocoder processing, acoustic information was spectrally distorted by shifting speech information from one frequency region to another. Six subjects were tested with spectrally shifted sentences after five days of practice with upwardly shifted training sentences. Training with upwardly shifted sentences significantly improved recognition of upwardly shifted speech; recognition of downwardly shifted speech was nearly unchanged. Three subjects were later trained with downwardly shifted speech. Results showed that the mean improvement was comparable to that observed with the upwardly shifted training. In this retrain and retest condition, performance was largely unchanged for upwardly shifted sentence recognition, suggesting that these listeners had retained some of the improved speech perception resulting from the previous training. The results suggest that listeners are able to partially adapt to a spectral shift in acoustic speech patterns over the short-term, given sufficient training. However, the improvement was localized to where the spectral shift was trained, as no change in performance was observed for spectrally altered speech outside of the trained regions.
Affiliation(s)
- Qian-Jie Fu
- Department of Auditory Implants and Perception, House Ear Institute, 2100 West Third Street, Los Angeles, California 90057, USA.

116
Faulkner A, Rosen S, Stanton D. Simulations of tonotopically mapped speech processors for cochlear implant electrodes varying in insertion depth. J Acoust Soc Am 2003;113:1073-1080. [PMID: 12597200] [DOI: 10.1121/1.1536928]
Abstract
It has been claimed that speech recognition with a cochlear implant is dependent on the frequency alignment of analysis bands in the speech processor with characteristic frequencies (CFs) at electrode locations. However, the most apical electrode location can often have a CF of 1 kHz or more. The use of filters aligned in frequency to relatively basal electrode arrays leads to the loss of lower frequency speech information. This study simulates a frequency-aligned speech processor and common array insertion depths to assess the significance of this loss. Noise-excited vocoders simulated processors driving eight electrodes 2 mm apart. Analysis filters always had center frequencies matching the CFs of the simulated stimulation sites. The simulated insertion depth of the most apical electrode was varied in 2-mm steps between 25 mm (CF 502 Hz) and 17 mm (CF 1851 Hz) from the cochlear base. Identification of consonants, vowels, and words in sentences all showed a significant decline between each of the three more basal simulated electrode configurations. Thus, if implant processors used analysis filters frequency-aligned to electrode CFs, patients whose most apical electrode is 19 mm (CF 1.3 kHz) or less from the cochlear base would suffer a significant loss of speech information.
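Applying a Greenwood place-frequency map with standard (assumed) human constants to this simulated array reproduces the characteristic frequencies quoted above to within a few percent, which is a useful sanity check on the geometry:

```python
import numpy as np

# Greenwood map with standard human constants (assumed values)
A, a, k, L = 165.4, 2.1, 0.88, 35.0

def cf_at_depth(mm_from_base):
    """Characteristic frequency (Hz) at a given insertion depth."""
    x = (L - mm_from_base) / L              # fraction of length from apex
    return A * (10.0 ** (a * x) - k)

# Eight simulated electrodes spaced 2 mm apart, most apical at 25 mm
depths = 25.0 - 2.0 * np.arange(8)          # 25, 23, ..., 11 mm from base
cfs = cf_at_depth(depths)
cf_25, cf_17 = cf_at_depth(25.0), cf_at_depth(17.0)
```

With these constants, `cf_25` comes out near 513 Hz and `cf_17` near 1843 Hz, close to the 502 Hz and 1851 Hz quoted in the abstract; the small discrepancy presumably reflects slightly different parameter choices in the original study.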
Affiliation(s)
- Andrew Faulkner
- Department of Phonetics and Linguistics, UCL, Wolfson House, 4 Stephenson Way, London NW1 2HE, United Kingdom.

117
Ru P, Chi T, Shamma S. The synergy between speech production and perception. J Acoust Soc Am 2003;113:498-515. [PMID: 12558287] [DOI: 10.1121/1.1525288]
Abstract
Speech intelligibility is known to be relatively unaffected by certain deformations of the acoustic spectrum. These include translations, stretching or contracting dilations, and shearing of the spectrum (represented along the logarithmic frequency axis). It is argued here that such robustness reflects a synergy between vocal production and auditory perception. Thus, on the one hand, it is shown that these spectral distortions are produced by common and unavoidable variations among different speakers pertaining to the length, cross-sectional profile, and losses of their vocal tracts. On the other hand, it is argued that these spectral changes leave the auditory cortical representation of the spectrum largely unchanged except for translations along one of its representational axes. These assertions are supported by analyses of production and perception models. On the production side, a simplified sinusoidal model of the vocal tract is developed which analytically relates a few "articulatory" parameters, such as the extent and location of the vocal tract constriction, to the spectral peaks of the acoustic spectra synthesized from it. The model is evaluated by comparing the identification of synthesized sustained vowels to labeled natural vowels extracted from the TIMIT corpus. On the perception side a "multiscale" model of sound processing is utilized to elucidate the effects of the deformations on the representation of the acoustic spectrum in the primary auditory cortex. Finally, the implications of these results for the perception of generally identifiable classes of sound sources beyond the specific case of speech and the vocal tract are discussed.
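The spectral "translations along the logarithmic frequency axis" above are simply uniform frequency scalings, roughly what a change of vocal-tract length does to formant frequencies. A short sketch with illustrative (assumed) formant values:

```python
import numpy as np

# A translation by d along the log-frequency axis maps log10(f) -> log10(f) + d,
# which is the same as scaling every frequency by alpha = 10**d.
formants = np.array([500.0, 1500.0, 2500.0])   # illustrative vowel formants (Hz)
d = np.log10(1.2)                              # shift on the log10-frequency axis
shifted = 10.0 ** (np.log10(formants) + d)     # equals 1.2 * formants
```

Because the shift is a pure translation on the log axis, the log-frequency spacings between formants (and hence the spectral shape on that axis) are preserved, which is the invariance the perceptual model exploits.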
Affiliation(s)
- Powen Ru
- Center for Auditory and Acoustics Research, Institute for Systems Research, Electrical and Computer Engineering Department, University of Maryland, College Park, Maryland 20742, USA
118
Shannon RV. The relative importance of amplitude, temporal, and spectral cues for cochlear implant processor design. Am J Audiol 2002; 11:124-7. [PMID: 12691223 DOI: 10.1044/1059-0889(2002/013)] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022] Open
Abstract
Speech understanding with cochlear implants has improved steadily over the last 25 years, and the success of implants has provided a powerful tool for understanding speech recognition in general. Comparing speech recognition in normal-hearing listeners and in cochlear-implant listeners has revealed many important lessons about the types of information necessary for good speech recognition--and some of the lessons are surprising. This paper presents a summary of speech perception research over the last 25 years with cochlear-implant and normal-hearing listeners. As long as the speech is audible, even relatively severe amplitude distortion has only a mild effect on intelligibility. Temporal cues appear to be useful for speech intelligibility only up to about 20 Hz. Whereas temporal information above 20 Hz may contribute to improved quality, it contributes little to speech understanding. In contrast, the quantity and quality of spectral information appear to be critical for speech understanding. As few as four spectral "channels" of information can produce good speech understanding, but more channels are required for difficult listening situations. Speech understanding is sensitive to the placement of spectral information along the cochlea. In prosthetic devices, in which the spectral information can be delivered to any cochlear location, it is critical to present spectral information to the normal acoustic tonotopic location for that information. If there is a shift or distortion of 2 to 3 mm between frequency and cochlear place, speech recognition is decreased dramatically.
Affiliation(s)
- Robert V Shannon
- Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, CA 90057, USA.
119
Green T, Faulkner A, Rosen S. Spectral and temporal cues to pitch in noise-excited vocoder simulations of continuous-interleaved-sampling cochlear implants. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2002; 112:2155-2164. [PMID: 12430827 DOI: 10.1121/1.1506688] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/24/2023]
Abstract
Four-band and single-band noise-excited vocoders were used in acoustic simulations to investigate spectral and temporal cues to melodic pitch in the output of a cochlear implant speech processor. Noise carriers were modulated by amplitude envelopes extracted by half-wave rectification and low-pass filtering at 32 or 400 Hz. The four-band, but not the single-band processors, may preserve spectral correlates of fundamental frequency (F0). Envelope smoothing at 400 Hz preserves temporal correlates of F0, which are eliminated with 32-Hz smoothing. Inputs to the processors were sawtooth frequency glides, in which spectral variation is completely determined by F0, or synthetic diphthongal vowel glides, whose spectral shape is dominated by varying formant resonances. Normal listeners labeled the direction of pitch movement of the processed stimuli. For processed sawtooth waves, purely temporal cues led to decreasing performance with increasing F0. With purely spectral cues, performance was above chance despite the limited spectral resolution of the processors. For processed diphthongs, performance with purely spectral cues was at chance, showing that spectral envelope changes due to formant movement obscured spectral cues to F0. Performance with temporal cues was poorer for diphthongs than for sawtooths, with very limited discrimination at higher F0. These data suggest that, for speech signals through a typical cochlear implant processor, spectral cues to pitch are likely to have limited utility, while temporal envelope cues may be useful only at low F0.
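The processing chain described in this abstract — band-pass analysis, envelope extraction by half-wave rectification and low-pass smoothing, and modulation of band-limited noise carriers — is the standard noise-excited vocoder. A minimal sketch follows; the filter orders and band edges are illustrative assumptions, not the authors' exact parameters:

```python
import numpy as np
from scipy.signal import butter, sosfilt

def vocode(signal, fs, band_edges, smooth_hz=400.0):
    """Noise-excited channel vocoder: one band-limited noise carrier per analysis band."""
    rng = np.random.default_rng(0)
    out = np.zeros_like(signal, dtype=float)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        bp = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfilt(bp, signal)
        env = np.maximum(band, 0.0)                 # half-wave rectification
        lp = butter(2, smooth_hz, btype="lowpass", fs=fs, output="sos")
        env = sosfilt(lp, env)                      # envelope smoothing (e.g. 32 or 400 Hz)
        carrier = sosfilt(bp, rng.standard_normal(len(signal)))  # band-limited noise
        out += env * carrier
    return out
```

With `smooth_hz=400` the envelopes retain temporal F0 correlates; with `smooth_hz=32` those periodicity cues are removed, which is the contrast the experiment exploits.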
Affiliation(s)
- Tim Green
- Department of Phonetics and Linguistics, University College London, United Kingdom.
120
Fu QJ, Shannon RV, Galvin JJ. Perceptual learning following changes in the frequency-to-electrode assignment with the Nucleus-22 cochlear implant. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2002; 112:1664-1674. [PMID: 12398471 DOI: 10.1121/1.1502901] [Citation(s) in RCA: 92] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/24/2023]
Abstract
The goal of the present study was to investigate the time course of adaptation by experienced cochlear implant users to a shifted frequency-to-electrode assignment in their speech processors. Speech recognition performance of three Nucleus-22 cochlear implant users was measured over a 3-month period, during which the implant listeners continuously wore "experimental" speech processors that were purposely shifted by 2-4 mm in terms of the frequency-to-electrode assignment relative to their normal processor. Baseline speech performance was measured with each subject's clinically assigned speech processor just prior to implementation of the experimental processor. Baseline speech performance was measured again after the 3-month test period, immediately following the reinstallation of the clinically assigned processor settings. Speech performance with the experimental processor was measured four times during the first week, and weekly thereafter over the 3-month period. Results showed that the experimental processor produced significantly lower performance on all measures of speech recognition immediately following implementation. Over the 3-month test period, consonant and HINT sentence recognition with the experimental processors gradually approached a performance level comparable to but still significantly below the baseline and postexperiment measures made with the clinically assigned processor. However, vowel and TIMIT sentence recognition with the experimental processors remained far below the level of the baseline measures even at the end of the 3-month experimental period. There was no significant change in performance with the clinically assigned processor before or after fitting with the experimental processor. The results suggest that long-term exposure to a new pattern of stimulation may not be able to compensate for the deficit in performance caused by a 2-4-mm shift in the tonotopic location of stimulation, at least within a 3-month period.
Affiliation(s)
- Qian-Jie Fu
- Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, California 90057, USA.
121
Välimaa TT, Määttä TK, Löppönen HJ, Sorri MJ. Phoneme recognition and confusions with multichannel cochlear implants: vowels. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2002; 45:1039-1054. [PMID: 12381059 DOI: 10.1044/1092-4388(2002/084)] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
The aim of this study was to investigate how postlingually severely or profoundly hearing-impaired adults relearn to recognize vowels after receiving multichannel cochlear implants. Vowel recognition of 19 Finnish-speaking subjects was studied for a minimum of 6 months and a maximum of 24 months using an open-set nonsense-syllable test in a prospective repeated-measure design. The responses were coded for phoneme errors, and 95% confidence intervals for recognition and confusions were calculated. The average vowel recognition was 68% (95% confidence interval = 66-70%) 6 months after switch-on and 80% (95% confidence interval = 78-82%) 24 months after switch-on. The vowels [ae], [u], [i], [o], and [a] were the easiest to recognize, and the vowels [y], [e], and [ø] were the most difficult. In conclusion, adaptation to electrical hearing using a multichannel cochlear implant was generally successful; however, for at least 2 years, when two vowels had F1 or F2 at roughly the same frequencies, confusions tended towards the closest vowel with the next-highest F1 or F2.
Affiliation(s)
- Taina T Välimaa
- Department of Finnish, Saami and Logopedics University of Oulu.
122
Välimaa TT, Määttä TK, Löppönen HJ, Sorri MJ. Phoneme recognition and confusions with multichannel cochlear implants: consonants. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2002; 45:1055-1069. [PMID: 12381060 DOI: 10.1044/1092-4388(2002/085)] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
The aim of this study was to investigate how postlingually severely or profoundly hearing-impaired adults relearn to recognize consonants after receiving multichannel cochlear implants. Consonant recognition of 19 Finnish-speaking subjects was studied for a minimum of 6 months and a maximum of 24 months using an open-set nonsense-syllable test in a prospective repeated-measure design. Responses were coded for phoneme errors, and proportions of correct responses and 95% confidence intervals were calculated for recognition and confusions. Two years after the switch-on, the mean recognition of consonants was 71% (95% confidence interval = 68-73%). The manner of articulation was easier to classify than the place of articulation, and the consonants [s], [r], [k], [t], [p], [n], and [j] were easier to recognize than [h], [m], [l], and [v]. Adaptation to electrical hearing with a multichannel cochlear implant was successful, but consonants with alveolar, palatal, or velar transitions (high F2) were better recognized than consonants with labial transitions (low F2). The locus of the F2 transitions of the consonants with better recognition was at the frequencies 1.5-2 kHz, whereas the locus of the F2 transitions of the consonants with poorer recognition was at 1.2-1.4 kHz. A tendency to confuse consonants with the closest consonant with higher F2 transition was also noted.
Affiliation(s)
- Taina T Välimaa
- Department of Finnish, Saami and Logopedics University of Oulu.
123
Abstract
OBJECTIVE To understand the short-term ("acute") effects of parametric variations to the frequency-to-electrode mapping on phoneme identification by Nucleus-22 cochlear implant listeners. METHODS Phoneme recognition was measured in five Nucleus-22 cochlear implant listeners using custom four-channel continuous interleaved sampler (CIS) processors. For the four-channel processors, speech signals were band-pass filtered into four broad frequency bands. The temporal envelope in each band was extracted by half-wave rectification and low-pass filtering at 160 Hz. The extracted envelope was then transformed to electric currents by a power function with an exponent of 0.2. The resulting electric currents were delivered to four electrode pairs (18,22), (13,17), (8,12), (3,7). The effect of frequency-to-electrode mapping was investigated by systematically varying the parameters of band-pass filters while fixing the electrode locations. Experiment 1 measured phoneme recognition as a function of the slope of band-pass filters. The slope of band-pass filters varied from 48 dB/octave to 6 dB/octave; the corner frequencies of band-pass filters were not varied. Experiment 2 measured phoneme recognition as a function of the distribution of band-pass filters across a fixed overall frequency range. The frequency divisions of a fixed overall frequency range were systematically varied from a logarithmic to a linear distribution. Experiment 3 measured phoneme recognition as a function of the bandwidth of the band-pass filters. The bandwidth of each filter varied from 0.2 to 2 octaves; the center frequencies for each band were not varied. No practice or feedback was provided for subjects in all experiments. RESULTS The slope of the band-pass filters had little effect on both vowel and consonant recognition. A slight performance drop was observed for only the shallowest slope condition (6 dB/octave). In contrast, the distribution of the band-pass filters had a strong effect on vowel recognition but a weak effect on consonant recognition. Best performance was achieved when a logarithmic or near-logarithmic frequency distribution was used to divide the overall frequency range. The bandwidth of the band-pass filters had a moderate effect on both vowel and consonant recognition. Vowel scores dropped significantly when the bandwidth of filters was too broad, whereas consonant scores dropped significantly when a narrower bandwidth was used. CONCLUSION Under "acute" testing conditions, phoneme recognition with a four-channel CIS strategy seems to be only mildly affected by the slope of the band-pass filters, but can be significantly affected by the distribution of filters as well as the bandwidth of the filters. Optimal or near-optimal performance can be achieved with a logarithmic frequency distribution. Vowels are more susceptible to broad bandwidths, whereas consonants are more susceptible to narrow bandwidths.
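Two of the manipulations tested here — dividing a fixed frequency range logarithmically versus linearly, and compressing the envelope with a power function (exponent 0.2) — are easy to make concrete. A sketch in normalized units; the corner frequencies are illustrative, not the study's exact values:

```python
import numpy as np

def band_edges(f_lo, f_hi, n_bands, spacing="log"):
    """Divide [f_lo, f_hi] into n_bands contiguous analysis bands."""
    if spacing == "log":
        return np.geomspace(f_lo, f_hi, n_bands + 1)  # constant frequency ratio per band
    return np.linspace(f_lo, f_hi, n_bands + 1)       # constant bandwidth in Hz per band

def envelope_to_current(env, exponent=0.2):
    """Power-function acoustic-to-electric amplitude mapping (normalized units)."""
    return np.clip(env, 0.0, None) ** exponent
```

A logarithmic division gives every band the same width in octaves, which is what the RESULTS section identifies as optimal or near-optimal for vowel recognition.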
Affiliation(s)
- Qian-Jie Fu
- Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, California 90057, USA
124
Throckmorton CS, Collins LM. The effect of channel interactions on speech recognition in cochlear implant subjects: predictions from an acoustic model. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2002; 112:285-296. [PMID: 12141354 DOI: 10.1121/1.1482073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
Acoustic models that produce speech signals with information content similar to that provided to cochlear implant users provide a mechanism by which to investigate the effect of various implant-specific processing or hardware parameters independent of other complicating factors. This study compares speech recognition of normal-hearing subjects listening through normal and impaired acoustic models of cochlear implant speech processors. The channel interactions that were simulated to impair the model were based on psychophysical data measured from cochlear implant subjects and include pitch reversals, indiscriminable electrodes, and forward masking effects. In general, spectral interactions degraded speech recognition more than temporal interactions. These effects were frequency dependent with spectral interactions that affect lower-frequency information causing the greatest decrease in speech recognition, and interactions that affect higher-frequency information having the least impact. The results of this study indicate that channel interactions, quantified psychophysically, affect speech recognition to different degrees. Investigation of the effects that channel interactions have on speech recognition may guide future research whose goal is compensating for psychophysically measured channel interactions in cochlear implant subjects.
Affiliation(s)
- Chandra S Throckmorton
- Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina 27708-0291, USA
125
McKay CM, Henshall KR. Frequency-to-electrode allocation and speech perception with cochlear implants. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2002; 111:1036-1044. [PMID: 11863160 DOI: 10.1121/1.1436073] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
The hypothesis was investigated that selectively increasing the discrimination of low-frequency information (below 2600 Hz) by altering the frequency-to-electrode allocation would improve speech perception by cochlear implantees. Two experimental conditions were compared, both utilizing ten electrode positions selected based on maximal discrimination. A fixed frequency range (200-10513 Hz) was allocated either relatively evenly across the ten electrodes, or so that nine of the ten positions were allocated to the frequencies up to 2600 Hz. Two additional conditions utilizing all available electrode positions (15-18 electrodes) were assessed: one with each subject's usual frequency-to-electrode allocation; and the other using the same analysis filters as the other experimental conditions. Seven users of the Nucleus CI22 implant wore processors mapped with each experimental condition for 2-week periods away from the laboratory, followed by assessment of perception of words in quiet and sentences in noise. Performance with both ten-electrode maps was significantly poorer than with both full-electrode maps on at least one measure. Performance with the map allocating nine out of ten electrodes to low frequencies was equivalent to that with the full-electrode maps for vowel perception and sentences in noise, but was worse for consonant perception. Performance with the evenly allocated ten-electrode map was equivalent to that with the full-electrode maps for consonant perception, but worse for vowel perception and sentences in noise. Comparison of the two full-electrode maps showed that subjects could fully adapt to frequency shifts up to ratio changes of 1.3, given 2 weeks' experience. Future research is needed to investigate whether speech perception may be improved by the manipulation of frequency-to-electrode allocation in maps which have a full complement of electrodes in Nucleus implants.
Affiliation(s)
- Colette M McKay
- The University of Melbourne, Department of Otolaryngology, Parkville, Australia.
126
Friesen LM, Shannon RV, Baskent D, Wang X. Speech recognition in noise as a function of the number of spectral channels: comparison of acoustic hearing and cochlear implants. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2001; 110:1150-63. [PMID: 11519582 DOI: 10.1121/1.1381538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/22/2023]
Abstract
Speech recognition was measured as a function of spectral resolution (number of spectral channels) and speech-to-noise ratio in normal-hearing (NH) and cochlear-implant (CI) listeners. Vowel, consonant, word, and sentence recognition were measured in five normal-hearing listeners, ten listeners with the Nucleus-22 cochlear implant, and nine listeners with the Advanced Bionics Clarion cochlear implant. Recognition was measured as a function of the number of spectral channels (noise bands or electrodes) at signal-to-noise ratios of +15, +10, +5, and 0 dB, and in quiet. Performance with three different speech processing strategies (SPEAK, CIS, and SAS) was similar across all conditions, and improved as the number of electrodes increased (up to seven or eight) for all conditions. For all noise levels, vowel and consonant recognition with the SPEAK speech processor did not improve with more than seven electrodes, while for normal-hearing listeners, performance continued to increase up to at least 20 channels. Speech recognition on more difficult speech materials (word and sentence recognition) showed a marginally significant increase in Nucleus-22 listeners from seven to ten electrodes. The average implant score on all processing strategies was poorer than scores of NH listeners with similar processing. However, the best CI scores were similar to the normal-hearing scores for that condition (up to seven channels). CI listeners with the highest performance level increased in performance as the number of electrodes increased up to seven, while CI listeners with low levels of speech recognition did not increase in performance as the number of electrodes was increased beyond four. These results quantify the effect of number of spectral channels on speech recognition in noise and demonstrate that most CI subjects are not able to fully utilize the spectral information provided by the number of electrodes used in their implant.
Affiliation(s)
- L M Friesen
- Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, California 90057, USA.
127
McDermott HJ, Dean MR. Speech perception with steeply sloping hearing loss: effects of frequency transposition. BRITISH JOURNAL OF AUDIOLOGY 2000; 34:353-61. [PMID: 11201322 DOI: 10.3109/03005364000000151] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
Six adults with a very steeply sloping high-frequency hearing loss listened to monosyllabic words in several conditions. In the first condition, their ability to identify phonemes with a signal-to-noise ratio of 6 dB was measured. Results were similar to those of normally hearing subjects listening to the same material through low-pass filters having comparable cut-off frequencies. In the remaining two conditions, four of the hearing-impaired subjects, and a control group of five normally hearing subjects, listened to speech in quiet with and without frequency transposition. The transposition lowered all speech frequencies by a factor of 0.6. Specific auditory training with transposed speech materials different from the materials used in the tests of speech perception was provided in 10 sessions, each of one hour's duration, which were scheduled at weekly intervals. Despite this training, no significant differences were found between the two conditions in these subjects' recognition of words. It is concluded that such a frequency-transposition scheme, if implemented in a wearable hearing aid, would be unlikely to benefit people with a sloping hearing impairment of this type.
Affiliation(s)
- H J McDermott
- Co-operative Research Centre for Cochlear Implant and Hearing Aid Innovation, East Melbourne, Australia.
128
Rosen S, Faulkner A, Wilkinson L. Adaptation by normal listeners to upward spectral shifts of speech: implications for cochlear implants. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 1999; 106:3629-3636. [PMID: 10615701 DOI: 10.1121/1.428215] [Citation(s) in RCA: 189] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
Multi-channel cochlear implants typically present spectral information to the wrong "place" in the auditory nerve array, because electrodes can only be inserted partway into the cochlea. Although such spectral shifts are known to cause large immediate decrements in performance in simulations, the extent to which listeners can adapt to such shifts has yet to be investigated. Here, the effects of a four-channel implant in normal listeners have been simulated, and performance tested with unshifted spectral information and with the equivalent of a 6.5-mm basalward shift on the basilar membrane (1.3-2.9 octaves, depending on frequency). As expected, the unshifted simulation led to relatively high levels of mean performance (e.g., 64% of words in sentences correctly identified) whereas the shifted simulation led to very poor results (e.g., 1% of words). However, after just nine 20-min sessions of connected discourse tracking with the shifted simulation, performance improved significantly for the identification of intervocalic consonants, medial vowels in monosyllables, and words in sentences (30% of words). Also, listeners were able to track connected discourse of shifted signals without lipreading at rates up to 40 words per minute. Although we do not know if complete adaptation to the shifted signals is possible, it is clear that short-term experiments seriously exaggerate the long-term consequences of such spectral shifts.
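The 6.5-mm basalward shift simulated in this study (quoted as 1.3-2.9 octaves depending on frequency) can be reproduced with Greenwood's place-frequency function. The constants below are the standard human values (A = 165.4, a = 0.06/mm, k = 0.88), an assumption here since the abstract does not state the exact mapping used:

```python
import math

A, a, k = 165.4, 0.06, 0.88  # standard human Greenwood constants (assumed)

def place_mm(freq):
    """Cochlear place in mm from the apex for a given frequency (Greenwood)."""
    return math.log10(freq / A + k) / a

def freq_hz(place):
    """Characteristic frequency at a cochlear place given in mm from the apex."""
    return A * (10 ** (a * place) - k)

def basal_shift(freq, shift_mm=6.5):
    """Frequency perceived when a component lands shift_mm basal of its normal place."""
    return freq_hz(place_mm(freq) + shift_mm)
```

Under these constants a 300-Hz component shifted 6.5 mm basally maps near 950 Hz (about 1.7 octaves), while a 4-kHz component shifts by roughly 1.3 octaves — larger octave shifts at low frequencies, consistent with the range quoted in the abstract.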
Affiliation(s)
- S Rosen
- Department of Phonetics and Linguistics, University College London, England
129
Fu QJ, Shannon RV. Effects of electrode location and spacing on phoneme recognition with the Nucleus-22 cochlear implant. Ear Hear 1999; 20:321-31. [PMID: 10466568 DOI: 10.1097/00003446-199908000-00005] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVE The objective of this paper was to determine how phoneme identification was affected by the cochlear location and spacing of the electrodes in cochlear implant listeners. DESIGN Subjects were initially programmed with the full complement of 20 active electrodes, in which each electrode was assigned to represent the output of one filter in the normal SPEAK processor. In the present study several four-electrode processors were constructed by assigning the output of more than one filter to a single electrode. In all conditions speech sounds were still analyzed into 20 frequency bands and processed according to the usual SPEAK processing strategy, but the location and spacing of the four stimulated electrode pairs were varied systematically. In Experiment I, the spacing between stimulated electrodes was fixed at 3.75 mm and the cochlear location of the four electrode pairs was shifted from the most-apical position up to 3.0 mm toward the base in 0.75 mm steps. In Experiment II, the spatial separation between the four electrode pairs (each bipolar-plus-one) was systematically changed from 1.5 mm to 4.5 mm while holding the most apical active electrode fixed. In Experiment III, the spacing of active electrodes was varied to represent equal tonotopic spacing to equal linear frequency intervals between pairs. Recognition of medial vowels and consonants was measured in three subjects with these custom four-electrode speech processors. RESULTS In Experiment I, results showed that both vowel and consonant recognition were best when the electrodes were in the most apical locations. In Experiment II, best speech recognition occurred when electrode pairs were separated by 3 to 3.75 mm. In Experiment III, both vowel and consonant recognition scores decreased when the spacing of electrode pairs was changed from equal tonotopic spacing to equal linear frequency intervals. Overall, vowel and consonant recognition were best at the most apical electrode locations and when the spacing of electrodes matched the frequency intervals of the analysis filters. Consonant recognition was relatively robust to alterations in electrode location and spacing. The best vowel scores with four-electrode speech processors were about 10 percentage points lower than scores obtained with the full 20-electrode speech processors. However, the best consonant scores with four-electrode speech processors were similar to those obtained with the full 20-electrode speech processors. Information transmission analysis revealed that temporal envelope cues (voicing and manner) were not strongly affected by changes in electrode location and spacing, whereas spectral cues, as represented by vowel recognition and consonantal place of articulation, were strongly affected. Both spectral and temporal phoneme cues were strongly affected by the degree of tonotopic warping, created by altering both the location and spacing of the activated electrodes. CONCLUSION The cochlear location and spacing of the activated electrodes had a clear effect on phoneme recognition. Temporal cues were less affected by tonotopic shifts or linear tonotopic stretching or shrinking, but were susceptible to nonlinear tonotopic warping. Spectral cues were sensitive to all tonotopic manipulations: shifting, linear stretching, and nonlinear warping. However, the present experiments could not differentiate whether the optimal mapping between analysis frequency bands and stimulation electrodes was determined by the normal acoustic tonotopic pattern or by the pattern learned from experience with the 20-electrode implant.
Affiliation(s)
- Q J Fu
- Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, California 90057, USA
130
Fu QJ, Shannon RV. Effects of electrode configuration and frequency allocation on vowel recognition with the Nucleus-22 cochlear implant. Ear Hear 1999; 20:332-44. [PMID: 10466569 DOI: 10.1097/00003446-199908000-00006] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVE This study was conducted to understand vowel recognition in cochlear implants as a function of the cochlear location and separation of the stimulated electrode pairs and as a function of the matching between speech spectral information and the location of the stimulated electrodes. DESIGN Four-electrode speech processors with a continuous interleaved sampling speech processing strategy were implemented through a custom interface in five subjects implanted with the Nucleus-22 cochlear implant. The temporal envelopes from four broad frequency bands were used to modulate 500 pps, 100 microsec/phase interleaved pulse trains delivered to four electrode pairs. Ten different frequency allocations and five sets of four-electrode configurations were tested. Each frequency allocation represented the same cochlear extent but different cochlear locations based on Greenwood's frequency-to-place formula. Recognition of multi-talker medial vowels was measured for each combination of parameters with no period of practice or adjustment. RESULTS Results showed that recognition of multi-talker vowels was highly dependent on frequency allocation for all electrode configurations. For a given electrode configuration maximum vowel recognition was observed with a specific frequency allocation. When the stimulated electrodes were shifted basally by 3 mm, the frequency allocation that produced the best performance also shifted basally by 3 mm. A similar pattern of vowel recognition was observed as a function of frequency allocation for electrode configurations that had the same apical-most electrode in each pair, regardless of location of the basal-most electrode in the pair. Subjects with different electrode insertion depths had similar trends in vowel recognition for each frequency allocation. CONCLUSIONS For a given electrode configuration, the best performance was obtained with processors with a specific frequency allocation. In addition, the apical-most member of each electrode pair had a much stronger influence on vowel recognition in electric hearing. Finally, results from this study also suggest that over time, patients with implants can partially adapt to a basal shift in place of stimulation.
Affiliation(s)
- Q J Fu
- Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, California 90057, USA