1
Cychosz M, Winn MB, Goupell MJ. How to vocode: Using channel vocoders for cochlear-implant research. J Acoust Soc Am 2024; 155:2407-2437. [PMID: 38568143 PMCID: PMC10994674 DOI: 10.1121/10.0025274]
Abstract
The channel vocoder has become a useful tool for understanding the impact of specific forms of auditory degradation, particularly the spectral and temporal degradation that reflects cochlear-implant processing. Vocoders have many parameters that allow researchers to answer questions about cochlear-implant processing in ways that overcome some logistical complications of controlling for factors in individual cochlear-implant users. However, there is such a large variety in the implementation of vocoders that the term "vocoder" is not specific enough to describe the signal processing used in these experiments. Misunderstanding vocoder parameters can result in experimental confounds or unexpected stimulus distortions. This paper highlights the signal processing parameters that should be specified when describing vocoder construction. The paper also provides guidance on how to determine vocoder parameters within perception experiments, given the experimenter's goals and research questions, to avoid common signal processing mistakes. Throughout, we assume that experimenters are interested in vocoders with the specific goal of better understanding cochlear implants.
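The abstract's point that "vocoder" underspecifies the signal processing can be made concrete with a minimal noise-carrier channel vocoder. This is an illustrative sketch only: the channel count, log-spaced channel edges, fourth-order Butterworth filters, and 300-Hz envelope cutoff are assumptions for the example, not parameters recommended by the paper.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfilt

def noise_vocode(signal, fs, n_channels=8, lo=100.0, hi=8000.0, env_cutoff=300.0):
    """Minimal channel vocoder: band-pass analysis, envelope extraction,
    and re-modulation of band-limited noise carriers (illustrative parameters)."""
    # Channel edges spaced equally on a log-frequency axis
    edges = np.logspace(np.log10(lo), np.log10(hi), n_channels + 1)
    env_sos = butter(4, env_cutoff, btype="low", fs=fs, output="sos")
    rng = np.random.default_rng(0)
    out = np.zeros_like(signal, dtype=float)
    for k in range(n_channels):
        band_sos = butter(4, [edges[k], edges[k + 1]], btype="band", fs=fs, output="sos")
        band = sosfilt(band_sos, signal)
        # Envelope: magnitude of the analytic signal, then low-pass filtered
        env = np.clip(sosfilt(env_sos, np.abs(hilbert(band))), 0.0, None)
        # Carrier: noise restricted to the same analysis band, scaled to unit RMS
        carrier = sosfilt(band_sos, rng.standard_normal(len(signal)))
        carrier /= np.sqrt(np.mean(carrier ** 2)) + 1e-12
        out += env * carrier
    return out
```

Swapping the noise carriers for tones at each channel's center frequency would turn this into a tone vocoder, one of the many implementation choices the paper argues should be reported.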
Affiliation(s)
- Margaret Cychosz
- Department of Linguistics, University of California, Los Angeles, Los Angeles, California 90095, USA
- Matthew B Winn
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, College Park, Maryland 20742, USA
2
Shader MJ, Kwon BJ, Gordon-Salant S, Goupell MJ. Open-Set Phoneme Recognition Performance With Varied Temporal Cues in Younger and Older Cochlear Implant Users. J Speech Lang Hear Res 2022; 65:1196-1211. [PMID: 35133853 PMCID: PMC9150732 DOI: 10.1044/2021_jslhr-21-00299]
Abstract
PURPOSE The goal of this study was to investigate the effect of age on recognition of phonemes that varied in the amount of temporal information available in the signal. Chronological age is increasingly recognized as a factor that can limit the amount of benefit an individual can receive from a cochlear implant (CI). Central auditory temporal processing deficits in older listeners may contribute to the performance gap between younger and older CI users on recognition of phonemes varying in temporal cues. METHOD Phoneme recognition was measured at three stimulation rates (500, 900, and 1800 pulses per second) and two envelope modulation frequencies (50 Hz and unfiltered) in 20 CI participants ranging in age from 27 to 85 years. Speech stimuli were multiple word pairs differing in temporal contrasts and were presented via direct stimulation of the electrode array using an eight-channel continuous interleaved sampling strategy. Phoneme recognition was evaluated in each stimulation rate condition using both envelope modulation frequencies. RESULTS Duration of deafness was the strongest subject-level predictor of phoneme recognition: participants with longer durations of deafness performed more poorly overall. Chronological age did not predict performance in any stimulus condition. Additionally, duration of deafness interacted with envelope filtering: participants with shorter durations of deafness were able to take advantage of higher-frequency envelope modulations, whereas participants with longer durations of deafness were not. CONCLUSIONS Age did not significantly predict phoneme recognition performance. In contrast, longer durations of deafness were associated with a reduced ability to utilize the temporal information available in the signal to improve phoneme recognition.
Affiliation(s)
- Maureen J. Shader
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN
- Matthew J. Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park
3
Martin IA, Goupell MJ, Huang YT. Children's syntactic parsing and sentence comprehension with a degraded auditory signal. J Acoust Soc Am 2022; 151:699. [PMID: 35232101 PMCID: PMC8816517 DOI: 10.1121/10.0009271]
Abstract
During sentence comprehension, young children anticipate syntactic structures using early-arriving words and have difficulty revising incorrect predictions using late-arriving words. However, nearly all work to date has focused on syntactic parsing in idealized speech environments, and little is known about how children's strategies for predicting and revising meanings are affected by signal degradation. This study compares comprehension of active and passive sentences in natural and vocoded speech. In a word-interpretation task, 5-year-olds inferred the meanings of novel words in sentences that (1) encouraged agent-first predictions (e.g., The blicket is eating the seal implies The blicket is the agent), (2) required revising predictions (e.g., The blicket is eaten by the seal implies The blicket is the theme), or (3) weakened predictions by placing familiar nouns in sentence-initial position (e.g., The seal is eating/eaten by the blicket). When novel words promoted agent-first predictions, children misinterpreted passives as actives, and errors increased with vocoded compared to natural speech. However, when familiar sentence-initial words weakened agent-first predictions, children accurately interpreted passives, with no signal-degradation effects. This demonstrates that signal quality interacts with interpretive processes during sentence comprehension, and that the impacts of speech degradation are greatest when late-arriving information conflicts with predictions.
Affiliation(s)
- Isabel A Martin
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Yi Ting Huang
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
4
Berg KA, Noble JH, Dawant BM, Dwyer RT, Labadie RF, Gifford RH. Speech recognition as a function of the number of channels for an array with large inter-electrode distances. J Acoust Soc Am 2021; 149:2752. [PMID: 33940865 PMCID: PMC8062138 DOI: 10.1121/10.0004244]
Abstract
This study investigated the number of channels available to cochlear implant (CI) recipients for maximum speech understanding and sound quality using lateral wall electrode arrays, which result in large electrode-to-modiolus distances and feature the greatest inter-electrode distances (2.1-2.4 mm), the longest active lengths (23.1-26.4 mm), and the fewest electrodes commercially available. Participants included ten post-lingually deafened adult CI recipients with MED-EL electrode arrays (FLEX28 and STANDARD) entirely within scala tympani. Electrode placement and scalar location were determined using computerized tomography. The number of channels was varied from 4 to 12 with equal spatial distribution across the array. A continuous interleaved sampling-based strategy was used. Speech recognition, sound quality ratings, and a closed-set vowel recognition task were measured acutely for each electrode condition. At the group level, participants did not demonstrate statistically significant differences beyond eight channels for almost all measures. However, several listeners showed considerable improvements from 8 to 12 channels for speech and sound quality measures. These results suggest that channel interaction caused by the greater electrode-to-modiolus distances of straight electrode arrays could be partially compensated for by a large inter-electrode distance or spacing.
Affiliation(s)
- Katelyn A Berg
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, 1215 21st Avenue South, Nashville, Tennessee 37232, USA
- Jack H Noble
- Department of Electrical Engineering and Computer Science, Vanderbilt University, 2201 West End Avenue, Nashville, Tennessee 37235, USA
- Benoit M Dawant
- Department of Electrical Engineering and Computer Science, Vanderbilt University, 2201 West End Avenue, Nashville, Tennessee 37235, USA
- Robert T Dwyer
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, 1215 21st Avenue South, Nashville, Tennessee 37232, USA
- Robert F Labadie
- Department of Otolaryngology, Vanderbilt University Medical Center, 1215 21st Avenue South, Nashville, Tennessee 37232, USA
- René H Gifford
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, 1215 21st Avenue South, Nashville, Tennessee 37232, USA
5
Goupell MJ, Draves GT, Litovsky RY. Recognition of vocoded words and sentences in quiet and multi-talker babble with children and adults. PLoS One 2020; 15:e0244632. [PMID: 33373427 PMCID: PMC7771688 DOI: 10.1371/journal.pone.0244632]
Abstract
A vocoder is used to simulate cochlear-implant sound processing in normal-hearing listeners. Typically, there is rapid improvement in vocoded speech recognition, but it is unclear whether the improvement rate differs across age groups and speech materials. Children (8-10 years) and young adults (18-26 years) were trained and tested over 2 days (4 hours) on recognition of eight-channel noise-vocoded words and sentences, in quiet and in the presence of multi-talker babble at signal-to-noise ratios of 0, +5, and +10 dB. Children achieved poorer performance than adults in all conditions, for both word and sentence recognition. With training, improvement rates in vocoded speech recognition did not differ significantly between children and adults, suggesting that learning to process speech cues degraded by vocoding does not differ developmentally across these age groups and types of speech materials. Furthermore, this result confirms that the acutely measured age difference in vocoded speech recognition persists after extended training.
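The babble conditions in this study are specified by signal-to-noise ratio; the scaling needed to mix a masker at a target SNR can be sketched as follows (the function and parameter names are illustrative, not taken from the study):

```python
import numpy as np

def mix_at_snr(speech, babble, snr_db):
    """Return speech + scaled babble such that the long-term speech-to-babble
    power ratio equals snr_db (inputs: equal-length 1-D arrays)."""
    p_speech = np.mean(speech ** 2)
    p_babble = np.mean(babble ** 2)
    # Target babble power is p_speech / 10^(snr_db / 10); solve for the gain.
    gain = np.sqrt(p_speech / (p_babble * 10.0 ** (snr_db / 10.0)))
    return speech + gain * babble
```

At 0 dB SNR the masker is scaled to equal power; at +10 dB it is 10 dB weaker than the speech.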
Affiliation(s)
- Matthew J. Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, United States of America
- Garrison T. Draves
- Waisman Center, University of Wisconsin, Madison, WI, United States of America
- Ruth Y. Litovsky
- Waisman Center, University of Wisconsin, Madison, WI, United States of America
- Department of Communication Sciences and Disorders, University of Wisconsin, Madison, WI, United States of America
6
Age-Related Differences in the Processing of Temporal Envelope and Spectral Cues in a Speech Segment. Ear Hear 2018; 38:e335-e342. [PMID: 28562426 DOI: 10.1097/aud.0000000000000447]
Abstract
OBJECTIVES As people age, they experience reduced temporal processing abilities. This results in a poorer ability to understand speech, particularly for degraded input signals. Cochlear implants (CIs) convey speech information via the temporal envelopes of a spectrally degraded input signal. Because there is an increasing number of older CI users, there is a need to understand how temporal processing changes with age. Therefore, the goal of this study was to quantify age-related reduction in temporal processing abilities when attempting to discriminate words based on temporal envelope information from spectrally degraded signals. DESIGN Younger normal-hearing (YNH) and older normal-hearing (ONH) participants were presented a continuum of speech tokens that varied in silence duration between phonemes (0 to 60 ms in 10-ms steps), and were asked to identify whether the stimulus was perceived more as the word "dish" or "ditch." Stimuli were vocoded using tonal carriers. The number of channels (1, 2, 4, 8, 16, and unprocessed) and temporal envelope low-pass filter cutoff frequency (50 and 400 Hz) were systematically varied. RESULTS For the unprocessed conditions, the YNH participants perceived the word ditch for smaller silence durations than the ONH participants, indicating that aging affects temporal processing abilities. There was no difference in performance between the unprocessed and 16-channel, 400-Hz vocoded stimuli. Decreasing the number of spectral channels decreased the ability to distinguish dish and ditch, as did decreasing the envelope cutoff frequency. The overall pattern of results revealed that reductions in spectral and temporal information had a relatively larger effect on the ONH participants than on the YNH participants. CONCLUSIONS Aging reduces the ability to utilize brief temporal cues in speech segments. Reducing spectral information, as occurs in a channel vocoder and in CI speech processing strategies, forces participants to use temporal envelope information; however, older participants are less capable of utilizing this information. These results suggest that providing as much spectral and temporal speech information as possible would benefit older CI users relatively more than younger CI users. In addition, the present findings help set expectations of clinical outcomes for speech understanding performance by adult CI users as a function of age.
7
Chang SA, Won JH, Kim H, Oh SH, Tyler RS, Cho CH. Frequency-Limiting Effects on Speech and Environmental Sound Identification for Cochlear Implant and Normal Hearing Listeners. J Audiol Otol 2018; 22:28-38. [PMID: 29325391 PMCID: PMC5784366 DOI: 10.7874/jao.2017.00178]
Abstract
BACKGROUND AND OBJECTIVES It is important to understand the frequency regions of cues used, and not used, by cochlear implant (CI) recipients. Speech and environmental sound recognition by individuals with CIs and with normal hearing (NH) was measured. Gradients were also computed to evaluate the pattern of change in identification performance with respect to the low-pass or high-pass filtering cutoff frequencies. SUBJECTS AND METHODS Frequency-limiting effects were implemented in the acoustic waveforms by passing the signals through low-pass filters (LPFs) or high-pass filters (HPFs) with seven different cutoff frequencies. Identification of Korean vowels and consonants produced by a male and a female speaker, and of environmental sounds, was measured. Crossover frequencies, at which the LPF and HPF conditions yield identical identification scores, were determined for each identification test. RESULTS CI and NH subjects showed similar changes in identification performance as a function of cutoff frequency for the LPF and HPF conditions, suggesting that degraded spectral information in the acoustic signals may constrain identification performance similarly for both subject groups. However, CI subjects were generally less efficient than NH subjects in using the limited spectral information for speech and environmental sound identification, owing to inefficient coding of acoustic cues through the CI sound processors. CONCLUSIONS These findings provide important information, for Korean, about how the frequency information received through a CI processor differs from that received with normal hearing for speech and environmental sounds.
Affiliation(s)
- Son-A Chang
- Department of Otolaryngology-Head and Neck Surgery, Seoul National University Hospital, Seoul, Korea
- Jong Ho Won
- Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN, USA
- HyangHee Kim
- Graduate Program of Speech and Language Pathology, Department and Research Institute of Rehabilitation Medicine, Yonsei University College of Medicine, Seoul, Korea
- Seung-Ha Oh
- Department of Otolaryngology-Head and Neck Surgery, Seoul National University Hospital, Seoul, Korea
- Richard S Tyler
- Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA, USA
- Chang Hyun Cho
- Department of Otolaryngology-Head and Neck Surgery, Gachon University Gil Medical Center, Incheon, Korea
8
9
Ehlers E, Kan A, Winn MB, Stoelb C, Litovsky RY. Binaural hearing in children using Gaussian enveloped and transposed tones. J Acoust Soc Am 2016; 139:1724. [PMID: 27106319 PMCID: PMC4826377 DOI: 10.1121/1.4945588]
Abstract
Children who use bilateral cochlear implants (BiCIs) show significantly poorer sound localization skills than their normal-hearing (NH) peers. This difference has been attributed, in part, to the fact that cochlear implants (CIs) do not faithfully transmit interaural time differences (ITDs) and interaural level differences (ILDs), which are known to be important cues for sound localization. Interestingly, little is known about binaural sensitivity in NH children, in particular with stimuli that constrain acoustic cues in a manner representative of CI processing. In order to better understand and evaluate binaural hearing in children with BiCIs, the authors first undertook a study on binaural sensitivity in NH children ages 8-10 and in adults. Experiments evaluated sound discrimination and lateralization using ITD and ILD cues, for stimuli with robust envelope cues but poor representation of temporal fine structure. Stimuli were spondaic words, Gaussian-enveloped tone pulse trains (100 pulses per second), and transposed tones. Results showed that discrimination thresholds in children were adult-like (15-389 μs for ITDs and 0.5-6.0 dB for ILDs). However, lateralization based on the same binaural cues showed higher variability than seen in adults. Results are discussed in the context of factors that may be responsible for the poor representation of binaural cues in bilaterally implanted children.
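A Gaussian-enveloped tone pulse train of the general kind used here (a tone carrier gated by Gaussian windows repeating at a fixed pulse rate) can be sketched as follows; the carrier frequency, duration, and envelope width are illustrative assumptions, not the study's stimulus parameters.

```python
import numpy as np

def gaussian_pulse_train(fs=44100, dur=0.3, carrier_hz=4000.0, pps=100, sigma_ms=0.5):
    """Tone carrier gated by Gaussian envelopes repeating at `pps` pulses
    per second (illustrative parameters)."""
    t = np.arange(int(dur * fs)) / fs
    carrier = np.sin(2 * np.pi * carrier_hz * t)
    sigma = sigma_ms / 1000.0
    env = np.zeros_like(t)
    # One Gaussian window per pulse period, centered mid-period
    for center in np.arange(0.5 / pps, dur, 1.0 / pps):
        env += np.exp(-0.5 * ((t - center) / sigma) ** 2)
    return env * carrier
```

Because the slowly varying envelope, rather than the temporal fine structure, carries the usable timing information, such stimuli constrain binaural cues in a manner representative of CI pulsatile stimulation.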
Affiliation(s)
- Erica Ehlers
- University of Wisconsin-Madison, Waisman Center, 1500 Highland Avenue, Madison, Wisconsin 53705, USA
- Alan Kan
- University of Wisconsin-Madison, Waisman Center, 1500 Highland Avenue, Madison, Wisconsin 53705, USA
- Matthew B Winn
- University of Wisconsin-Madison, Waisman Center, 1500 Highland Avenue, Madison, Wisconsin 53705, USA
- Corey Stoelb
- University of Wisconsin-Madison, Waisman Center, 1500 Highland Avenue, Madison, Wisconsin 53705, USA
- Ruth Y Litovsky
- University of Wisconsin-Madison, Waisman Center, 1500 Highland Avenue, Madison, Wisconsin 53705, USA
10
Caldwell M, Rankin SK, Jiradejvong P, Carver C, Limb CJ. Cochlear implant users rely on tempo rather than on pitch information during perception of musical emotion. Cochlear Implants Int 2015; 16 Suppl 3:S114-20. [DOI: 10.1179/1467010015z.000000000265]
11
Venail F, Mathiolon C, Menjot de Champfleur S, Piron JP, Sicard M, Villemus F, Vessigaud MA, Sterkers-Artieres F, Mondain M, Uziel A. Effects of Electrode Array Length on Frequency-Place Mismatch and Speech Perception with Cochlear Implants. Audiol Neurootol 2015; 20:102-11. [DOI: 10.1159/000369333]
Abstract
Frequency-place mismatch often occurs after cochlear implantation, yet its effect on speech perception outcomes remains unclear. In this article, we propose a method, based on cochlear imaging, to determine the cochlear place-frequency map. We evaluated the effect of frequency-place mismatch on speech perception in subjects implanted with three different lengths of electrode arrays. Deeper insertions produced larger frequency-place mismatches and smaller, more delayed improvements in speech perception than shallower insertions, which showed a similar but milder effect. Our results support the notion that selecting an electrode array length adapted to each individual's cochlear anatomy may reduce frequency-place mismatch and thus improve speech perception outcomes.
12
Whitmal NA, DeMaio D, Lin R. Effects of envelope bandwidth on importance functions for cochlear implant simulations. J Acoust Soc Am 2015; 137:733-744. [PMID: 25698008 DOI: 10.1121/1.4906260]
Abstract
Frequency-importance functions (FIFs) quantify intelligibility contributions of spectral regions of speech. In previous work, FIFs were considered as instruments for characterizing intelligibility contributions of individual cochlear implant electrode channels. Comparisons of FIFs for natural speech and vocoder-simulated implant processed speech showed that vocoding shifted peak importance regions downward in frequency by 0.5 octaves. These shifts were attributed to voicing cue changes, and may reflect increased reliance on low-frequency information (apart from periodicity cues) for correct voicing perception. The purpose of this study was to determine whether increasing channel envelope bandwidth would reverse these shifts by improving access to voicing and pitch cues. Importance functions were measured for 48 subjects with normal hearing, who listened to vowel-consonant-vowel tokens either as recorded or as output from five different vocoders that simulated implant processing. Envelopes were constructed using filters that either included or excluded pitch information. Results indicate that vocoding-based shifts are only partially counteracted by including pitch information; moreover, a substantial baseline shift is present even for vocoders with high spectral resolution. The results also suggest that vocoded speech intelligibility is most sensitive to a loss of spectral resolution in high-importance regions, a finding with possible implications for cochlear implant electrode mapping.
Affiliation(s)
- Nathaniel A Whitmal
- Department of Communication Disorders, University of Massachusetts, Amherst, Massachusetts 01003
- Decia DeMaio
- Department of Communication Disorders, University of Massachusetts, Amherst, Massachusetts 01003
- Rongheng Lin
- Division of Biostatistics and Epidemiology, School of Public Health and Health Sciences, University of Massachusetts, Amherst, Massachusetts 01003
13
Grasmeder ML, Verschuur CA, Batty VB. Optimizing frequency-to-electrode allocation for individual cochlear implant users. J Acoust Soc Am 2014; 136:3313. [PMID: 25480076 DOI: 10.1121/1.4900831]
Abstract
Individual adjustment of frequency-to-electrode assignment in cochlear implants (CIs) may potentially improve speech perception outcomes. Twelve adult CI users were recruited for an experiment in which frequency maps were adjusted using insertion angles estimated from post-operative x rays; results were analyzed for the ten participants with good-quality x rays. The allocations were a mapping to the Greenwood function, a compressed map limited to the area containing spiral ganglion (SG) cells, a reduced frequency range (RFR) map, and participants' clinical maps. A trial period of at least six weeks was given for the clinical, Greenwood, and SG maps, although participants could return to their clinical map if they wished. Performance with the Greenwood map was poor for both sentence and vowel perception and correlated with insertion angle; performance with the SG map was poorer than with the clinical map. For sentence perception, the RFR map was significantly better than the clinical map for three participants but worse for three others. Those with improved performance had relatively deep insertions and poor electrode discrimination ability for apical electrodes. The results suggest that CI performance could be improved by adjusting the frequency allocation based on a measure of insertion angle and/or electrode discrimination ability.
Affiliation(s)
- Mary L Grasmeder
- Auditory Implant Service, Faculty of Engineering and the Environment, Building 19, University of Southampton, Southampton SO17 1BJ, United Kingdom
- Carl A Verschuur
- Auditory Implant Service, Faculty of Engineering and the Environment, Building 19, University of Southampton, Southampton SO17 1BJ, United Kingdom
- Vincent B Batty
- Wessex Neurological Centre, University Hospital Southampton NHS Foundation Trust, Tremona Road, Southampton SO16 6YD, United Kingdom
14
Yoon YS, Shin YR, Fu QJ. Binaural benefit with and without a bilateral spectral mismatch in acoustic simulations of cochlear implant processing. Ear Hear 2013; 34:273-9. [PMID: 22968427 DOI: 10.1097/aud.0b013e31826709e8]
Abstract
OBJECTIVES This study investigated whether a spectral mismatch across ears influences the benefits of redundancy, squelch, and head shadow differently in speech perception, using acoustic simulation of bilateral cochlear implant (CI) processing. DESIGN Ten normal-hearing subjects were tested with acoustic simulations of CIs. Sentence recognition, presented unilaterally and bilaterally, was measured at +5 dB and +10 dB signal-to-noise ratios (SNRs) in bilaterally matched and mismatched conditions. Unilateral and bilateral CIs were simulated using 8-channel sine wave vocoders. Binaural spectral mismatch was introduced by changing the relative simulated insertion depths across ears. Subjects were tested while listening with headphones; head-related transfer functions were applied before the vocoder processing to preserve natural interaural level and time differences. RESULTS For both SNRs, greater and more consistent binaural benefit of squelch and redundancy occurred for the matched condition, whereas binaural interference of squelch and redundancy occurred for the mismatched condition. However, significant binaural benefit of head shadow existed irrespective of spectral mismatches and SNRs. CONCLUSIONS The results suggest that bilateral spectral mismatch may have a negative impact on the binaural benefit of squelch and redundancy for bilateral CI users. They also suggest that clinical mapping should be carefully administered for bilateral CI users to minimize the difference in spectral patterns between the two CIs.
Affiliation(s)
- Yang-Soo Yoon
- Division of Communication and Auditory Neuroscience, House Research Institute, Los Angeles, CA, USA
15
Majdak P, Walder T, Laback B. Effect of long-term training on sound localization performance with spectrally warped and band-limited head-related transfer functions. J Acoust Soc Am 2013; 134:2148-2159. [PMID: 23967945 DOI: 10.1121/1.4816543]
Abstract
Sound localization in the sagittal planes, including the ability to distinguish front from back, relies on spectral features caused by the filtering effects of the head, pinna, and torso. It is assumed that important spatial cues are encoded in the frequency range between 4 and 16 kHz. In this study, in a double-blind design and using audio-visual training covering the full 3-D space, normal-hearing listeners were trained 2 h per day over three weeks to localize sounds which were either band limited up to 8.5 kHz or spectrally warped from the range between 2.8 and 16 kHz to the range between 2.8 and 8.5 kHz. The training effect for the warped condition exceeded that for procedural task learning, suggesting a stable auditory recalibration due to the training. After the training, performance with band-limited sounds was better than that with warped ones. The results show that training can improve sound localization in cases where spectral cues have been reduced by band-limiting or remapped by warping. This suggests that hearing-impaired listeners, who have limited access to high frequencies, might also improve their localization ability when provided with spectrally warped or band-limited sounds and adequately trained on sound localization.
Affiliation(s)
- Piotr Majdak
- Acoustics Research Institute, Austrian Academy of Sciences, Wohllebengasse 12-14, A-1040 Vienna, Austria
16
Landwehr M, Fürstenberg D, Walger M, von Wedel H, Meister H. Effects of various electrode configurations on music perception, intonation and speaker gender identification. Cochlear Implants Int 2013; 15:27-35. [PMID: 23684531 DOI: 10.1179/1754762813y.0000000037]
Abstract
Advances in speech coding strategies and electrode array designs for cochlear implants (CIs) predominantly aim at improving speech perception. Current efforts are also directed at transmitting appropriate cues of the fundamental frequency (F0) to the auditory nerve with respect to speech quality, prosody, and music perception. The aim of this study was to examine the effects of various electrode configurations and coding strategies on speech intonation identification, speaker gender identification, and music quality rating. In six MED-EL CI users, electrodes were selectively deactivated to simulate different insertion depths and inter-electrode distances with the high-definition continuous interleaved sampling (HDCIS) and fine structure processing (FSP) speech coding strategies. Identification of intonation and speaker gender was determined, and music quality rating was assessed. For intonation identification, HDCIS was robust against the different electrode configurations, whereas FSP showed significantly worse results when a short electrode insertion depth was simulated. In contrast, speaker gender recognition was not affected by electrode configuration or coding strategy. Music quality rating was sensitive to electrode configuration. In conclusion, the three experiments revealed different outcomes, even though all addressed the reception of F0 cues. Rapid changes in F0, as occur in intonation, were the most sensitive to electrode configurations and coding strategies. In contrast, electrode configurations and coding strategies did not show large effects when F0 information was available over a longer time period, as with speaker gender. Music quality relies on additional spectral cues beyond F0, and was poorest when a shallow insertion was simulated.
Collapse
|
17
|
Whitmal NA, DeRoy K. Use of an adaptive-bandwidth protocol to measure importance functions for simulated cochlear implant frequency channels. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2012; 131:1359-1370. [PMID: 22352509 PMCID: PMC3292607 DOI: 10.1121/1.3672684] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/12/2011] [Revised: 11/17/2011] [Accepted: 11/18/2011] [Indexed: 05/29/2023]
Abstract
The Articulation Index and Speech Intelligibility Index predict intelligibility scores from measurements of speech and hearing parameters. One component in the prediction is the frequency-importance function, a weighting function that characterizes contributions of particular spectral regions of speech to speech intelligibility. The purpose of this study was to determine whether such importance functions could similarly characterize contributions of electrode channels in cochlear implant systems. Thirty-eight subjects with normal hearing listened to vowel-consonant-vowel tokens, either as recorded or as output from vocoders that simulated aspects of cochlear implant processing. Importance functions were measured using the method of Whitmal and DeRoy [J. Acoust. Soc. Am. 130, 4032-4043 (2011)], in which signal bandwidths were varied adaptively to produce specified token recognition scores in accordance with the transformed up-down rules of Levitt [J. Acoust. Soc. Am. 49, 467-477 (1971)]. Psychometric functions constructed from recognition scores were subsequently converted into importance functions. Comparisons of the resulting importance functions indicate that vocoder processing causes peak importance regions to shift downward in frequency. This shift is attributed to changes in strategy and capability for detecting voicing in speech, and is consistent with previously measured data.
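The transformed up-down rules of Levitt cited in this abstract can be sketched as a simple adaptive staircase. The following is an illustrative 2-down/1-up track (one of Levitt's transformed rules, which converges on approximately 70.7% correct), not the authors' actual adaptive-bandwidth protocol; the `respond` callback, step size, and reversal count are assumptions for demonstration:

```python
def transformed_up_down(respond, start, step, n_down=2, n_up=1, n_reversals=8):
    """Minimal 2-down/1-up staircase (after Levitt, 1971): the tracked
    value decreases after n_down consecutive correct responses and
    increases after n_up incorrect ones, converging near 70.7% correct.
    Returns the mean of the reversal points as the threshold estimate."""
    level = start
    correct_run, wrong_run = 0, 0
    reversals, last_dir = [], 0
    while len(reversals) < n_reversals:
        if respond(level):              # True = correct trial at this level
            correct_run += 1
            wrong_run = 0
            if correct_run == n_down:   # two in a row correct -> go down
                correct_run = 0
                level -= step
                if last_dir == +1:      # direction changed: record reversal
                    reversals.append(level)
                last_dir = -1
        else:
            wrong_run += 1
            correct_run = 0
            if wrong_run == n_up:       # one incorrect -> go up
                wrong_run = 0
                level += step
                if last_dir == -1:
                    reversals.append(level)
                last_dir = +1
    return sum(reversals) / len(reversals)
```

With a deterministic listener who is correct whenever the level is at least 6, a track started at 10 with step 1 oscillates between 5 and 6 and returns 5.5, bracketing the true threshold. In the study above, the tracked variable was signal bandwidth rather than level, but the up-down logic is the same.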
Collapse
Affiliation(s)
- Nathaniel A Whitmal
- Department of Communication Disorders, University of Massachusetts, Amherst, Massachusetts 01003, USA.
Collapse
|
18
|
Abstract
OBJECTIVES This review examines evidence for potential benefits of using cochlear implant electrodes that extend into the apical regions of the cochlea. Most cochlear implant systems use electrode arrays that extend 1 to 1.5 turns from the basal cochleostomy, but one manufacturer (MED-EL GmbH) uses an electrode array that is considerably longer. The fundamental rationale for using electrodes extending toward the apex of the cochlea is to provide additional low-pitched auditory percepts and thereby increase the spectral information available to the user. Several experimental long arrays have also been produced by other manufacturers to assess potential benefits of this approach. DESIGN In addition to assessing the effects of deeply inserted electrodes on performance, this review examines several underlying and associated issues, including cochlear anatomy, electrode design, surgical considerations (including insertion trauma), and pitch scaling trials. Where possible, the aim is to draw conclusions regarding the potential of apical electrodes in general, rather than the performance of specific and current devices. RESULTS Imaging studies indicate that currently available electrode arrays rarely extend more than two turns into the cochlea, the mean insertion angle for full insertions of the MED-EL electrodes being about 630°. This is considerably shorter than the total length of the cochlea and more closely approximates the length of the spiral ganglion. Anatomical considerations, and some modelling studies, suggest that fabrication of even longer electrodes is unlikely to provide additional spectral information. The issue of potential benefit from the most apical electrodes, therefore, is whether they are able to selectively stimulate discrete and tonotopically ordered neural populations near the apex of the spiral ganglion, where the ganglion cells are closely grouped. Pitch scaling studies, using the MED-EL and experimental long arrays, suggest that this is achieved in many cases, but that a significant number of individuals show evidence of pitch confusions or reversals among the most apical electrodes, presumably reducing potential performance benefit and presenting challenges for processor programming. CONCLUSIONS Benefits in terms of speech recognition and other performance measures are less clear. Several studies have indicated that deactivation of apical electrodes results in poorer speech recognition performance, but these have been mostly acute studies where the subjects have been accustomed to the full complement of electrodes, thus making interpretation difficult. Some chronic studies have suggested that apical electrodes do provide additional performance benefit, but others have shown performance improvement after deactivating some of the apical electrodes. Whether or not deeply inserted electrodes can offer performance benefits, there is evidence that currently available designs tend to produce more intracochlear trauma than shorter arrays, in terms of loss of residual acoustic hearing and reduction of the neural substrate. This may have important long-term consequences for the user. Furthermore, as it is possible that subjects with better low-frequency residual hearing are more likely to benefit from the inclusion of apical electrodes, there may be a potential clinical dilemma as the same subjects are those most likely to benefit from bimodal electroacoustic stimulation, requiring a relatively shallow insertion.
Collapse
|
19
|
Goupell MJ, Majdak P, Laback B. Median-plane sound localization as a function of the number of spectral channels using a channel vocoder. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 127:990-1001. [PMID: 20136221 PMCID: PMC3061453 DOI: 10.1121/1.3283014] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/25/2009] [Revised: 12/04/2009] [Accepted: 12/09/2009] [Indexed: 05/16/2023]
Abstract
Using a vocoder, median-plane sound localization performance was measured in eight normal-hearing listeners as a function of the number of spectral channels. The channels were contiguous and logarithmically spaced in the range from 0.3 to 16 kHz. Acute testing with vocoded stimuli showed significantly worse localization than with noises and 100-pulse click trains, both of which were tested after feedback training. However, localization for the vocoded stimuli was better than chance. A second experiment was performed using two different 12-channel spacings for the vocoded stimuli, now including feedback training. One spacing was from experiment 1. The second spacing (called the speech-localization spacing) assigned more channels to the frequency range associated with speech. There was no significant difference in localization between the two spacings. However, even with training, localizing 12-channel vocoded stimuli remained worse than localizing virtual wideband noises by 4.8 degrees in local root-mean-square error and 5.2% in quadrant error rate. Speech understanding for the speech-localization spacing was not significantly different from that for a typical spacing used by cochlear-implant users. These experiments suggest that current cochlear implants have a sufficient number of spectral channels for some vertical-plane sound localization capabilities, albeit worse than in normal-hearing listeners, without loss of speech understanding.
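The contiguous, logarithmically spaced channels described above (0.3 to 16 kHz) can be reproduced with a few lines of code. This is an illustration of the spacing rule only, not the authors' implementation; the function name `log_channel_edges` and the default values are assumptions:

```python
import numpy as np

def log_channel_edges(n_channels, f_lo=300.0, f_hi=16000.0):
    """Cutoff frequencies (Hz) for contiguous, logarithmically spaced
    vocoder analysis channels. Returns n_channels + 1 band edges, so
    that channel i spans edges[i] to edges[i + 1] and every channel
    covers the same frequency ratio."""
    return np.geomspace(f_lo, f_hi, n_channels + 1)

edges = log_channel_edges(12)   # 13 edges for the 12-channel condition
```

Because the spacing is geometric, `edges[i + 1] / edges[i]` is constant across channels, which is what gives each band an equal width on a logarithmic frequency axis.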
Collapse
Affiliation(s)
- Matthew J Goupell
- Acoustics Research Institute, Austrian Academy of Sciences, Wohllebengasse 12-14, A-1040 Vienna, Austria.
Collapse
|