1. Choi I, Gander PE, Berger JI, Woo J, Choy MH, Hong J, Colby S, McMurray B, Griffiths TD. Spectral Grouping of Electrically Encoded Sound Predicts Speech-in-Noise Performance in Cochlear Implantees. J Assoc Res Otolaryngol 2023; 24:607-617. [PMID: 38062284] [PMCID: PMC10752853] [DOI: 10.1007/s10162-023-00918-x]
Abstract
OBJECTIVES Cochlear implant (CI) users exhibit large variability in understanding speech in noise. Past work in CI users found that spectral and temporal resolution correlates with speech-in-noise ability, but a large portion of variance remains unexplained. Recent work on normal-hearing listeners showed that the ability to group temporally and spectrally coherent tones in a complex auditory scene predicts speech-in-noise ability independently of the audiogram, highlighting a central mechanism for auditory scene analysis that contributes to speech-in-noise perception. The current study examined whether this auditory grouping ability also contributes to speech-in-noise understanding in CI users. DESIGN Forty-seven post-lingually deafened CI users were tested with psychophysical measures of spectral and temporal resolution, a stochastic figure-ground task that depends on detecting a figure by grouping multiple fixed-frequency elements against a random background, and a sentence-in-noise measure. Multiple linear regression was used to predict sentence-in-noise performance from the other tasks. RESULTS No collinearity was found among the predictor variables. All three predictors (spectral and temporal resolution plus the figure-ground task) contributed significantly to the multiple linear regression model, indicating that auditory grouping ability in a complex auditory scene explains a further proportion of variance in CI users' speech-in-noise performance that was not explained by spectral and temporal resolution. CONCLUSION Measures of cross-frequency grouping reflect an auditory cognitive mechanism that determines speech-in-noise understanding independently of cochlear function. Such measures are easily implemented clinically as predictors of CI success and suggest potential strategies for rehabilitation based on training with non-speech stimuli.
Affiliation(s)
- Inyong Choi
- Department of Communication Sciences and Disorders, University of Iowa, 250 Hawkins Dr., Iowa City, IA, 52242, USA.
- Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA.
- Phillip E Gander
- Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Department of Neurosurgery, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Department of Radiology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Joel I Berger
- Department of Neurosurgery, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Jihwan Woo
- Department of Biomedical Engineering, University of Ulsan, Ulsan, Republic of Korea
- Matthew H Choy
- Biosciences Institute, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
- Jean Hong
- Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Sarah Colby
- Department of Psychological and Brain Sciences, University of Iowa, Iowa City, IA, 52242, USA
- Bob McMurray
- Department of Communication Sciences and Disorders, University of Iowa, 250 Hawkins Dr., Iowa City, IA, 52242, USA
- Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Department of Psychological and Brain Sciences, University of Iowa, Iowa City, IA, 52242, USA
- Timothy D Griffiths
- Biosciences Institute, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
2. Intensive Training of Spatial Hearing Promotes Auditory Abilities of Bilateral Cochlear Implant Adults: A Pilot Study. Ear Hear 2023; 44:61-76. [PMID: 35943235] [DOI: 10.1097/aud.0000000000001256]
Abstract
OBJECTIVE The aim of this study was to evaluate the feasibility of a virtual reality-based spatial hearing training protocol in bilateral cochlear implant (CI) users and to provide pilot data on the impact of this training on different qualities of hearing. DESIGN Twelve bilateral CI adults aged between 19 and 69 followed an intensive 10-week rehabilitation program comprising eight virtual reality training sessions (two per week) interspersed with several evaluation sessions (2 weeks before training started, after four and eight training sessions, and 1 month after the end of training). During each 45-minute training session, participants localized a sound source whose position varied in azimuth and/or in elevation. At the start of each trial, CI users received no information about sound location, but after each response, feedback was given to enable error correction. Participants were divided into two groups: a multisensory feedback group (audiovisual spatial cue) and a unisensory group (visual spatial cue) that received feedback only in a wholly intact sensory modality. Training benefits were measured at each evaluation point using three tests: 3D sound localization in virtual reality, the French Matrix test, and the Speech, Spatial and other Qualities of Hearing questionnaire. RESULTS The training was well accepted and all participants attended the whole rehabilitation program. Four training sessions spread across 2 weeks were insufficient to induce significant performance changes, whereas performance on all three tests improved after eight training sessions. Front-back confusions decreased from 32% to 14.1% (p = 0.017); the speech recognition threshold improved from 1.5 dB to -0.7 dB signal-to-noise ratio (p = 0.029), and eight CI users achieved a negative signal-to-noise ratio. One month after the end of structured training, these performance improvements were still present, and quality of life was significantly improved for both self-reports of sound localization (from 5.3 to 6.7, p = 0.015) and speech understanding (from 5.2 to 5.9, p = 0.048). CONCLUSIONS This pilot study shows the feasibility and potential clinical relevance of this type of intervention involving an immersive sensory environment, and could pave the way for more systematic rehabilitation programs after cochlear implantation.
3. Nonlinguistic Outcome Measures in Adult Cochlear Implant Users Over the First Year of Implantation. Ear Hear 2018; 37:354-364. [PMID: 26656317] [DOI: 10.1097/aud.0000000000000261]
Abstract
OBJECTIVES Postlingually deaf cochlear implant users' speech perception improves over several months after implantation due to a learning process that involves integrating the new acoustic information presented by the device. Basic tests of hearing acuity might evaluate sensitivity to the new acoustic information while being less sensitive to learning effects. It was hypothesized that, unlike speech perception, basic spectral and temporal discrimination abilities would not change over the first year of implant use. If there were limited change over time and the test scores were correlated with clinical outcome, the tests might be useful for acute diagnostic assessments of hearing ability, and also for testing speakers of any language, many of which lack validated speech tests. DESIGN Ten newly implanted cochlear implant users were tested for speech understanding in quiet and in noise at 1 and 12 months postactivation. Spectral-ripple discrimination, temporal-modulation detection, and Schroeder-phase discrimination abilities were evaluated at 1, 3, 6, 9, and 12 months postactivation. RESULTS Speech understanding in quiet improved between 1 and 12 months postactivation (mean 8% improvement). Speech-in-noise performance showed no statistically significant improvement. Mean spectral-ripple discrimination thresholds and temporal-modulation detection thresholds for modulation frequencies of 100 Hz and above also showed no significant improvement. Spectral-ripple discrimination thresholds were significantly correlated with speech understanding. Detection of low-rate modulation and Schroeder-phase discrimination improved over the period. Individual learning trends varied, but the majority of listeners followed the same stable pattern as the group data.
CONCLUSIONS Spectral-ripple discrimination ability and temporal-modulation detection at 100-Hz modulation and above might serve as a useful diagnostic tool for early acute assessment of cochlear implant outcome for listeners speaking any native language.
4. Léger AC, Reed CM, Desloge JG, Swaminathan J, Braida LD. Consonant identification in noise using Hilbert-transform temporal fine-structure speech and recovered-envelope speech for listeners with normal and impaired hearing. J Acoust Soc Am 2015; 138:389-403. [PMID: 26233038] [PMCID: PMC4514718] [DOI: 10.1121/1.4922949]
Abstract
Consonant-identification ability was examined in normal-hearing (NH) and hearing-impaired (HI) listeners in the presence of steady-state and 10-Hz square-wave interrupted speech-shaped noise. The Hilbert transform was used to process speech stimuli (16 consonants in a-C-a syllables) to present envelope cues, temporal fine-structure (TFS) cues, or envelope cues recovered from TFS speech. The performance of the HI listeners was inferior to that of the NH listeners in terms of both lower baseline performance and the need for a higher signal-to-noise ratio to yield a given level of performance. For NH listeners, scores were higher in interrupted noise than in steady-state noise for all speech types (indicating substantial masking release). For HI listeners, masking release was typically observed for TFS and recovered-envelope speech but not for unprocessed and envelope speech. For both groups of listeners, TFS and recovered-envelope speech yielded similar levels of performance and consonant confusion patterns. The masking release observed for TFS and recovered-envelope speech may be related to level effects associated with the manner in which the TFS processing interacts with the interrupted noise signal, rather than to the contributions of TFS cues per se.
Affiliation(s)
- Agnès C Léger
- School of Psychological Sciences, University of Manchester, Manchester, M13 9PL, United Kingdom
- Charlotte M Reed
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Joseph G Desloge
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Jayaganesh Swaminathan
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Louis D Braida
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
5. Moon IJ, Hong SH. What is temporal fine structure and why is it important? Korean J Audiol 2014; 18:1-7. [PMID: 24782944] [PMCID: PMC4003734] [DOI: 10.7874/kja.2014.18.1.1]
Abstract
Complex sounds such as speech can be characterized as the sum of a number of amplitude-modulated signals representing the outputs of an array of narrow frequency bands. The temporal information at the output of each band can be separated into the temporal fine structure (TFS), the rapid oscillations with rates close to the center frequency of the band, and the temporal envelope (ENV), the slower amplitude modulations superimposed on the TFS. TFS information can be carried in the pattern of neural phase locking to the stimulus waveform, while ENV information is carried by changes in firing rate over time. The relative importance of ENV and TFS information for understanding speech has been studied using various sound-processing techniques. A number of studies have demonstrated that ENV cues support speech recognition in quiet, while TFS cues are linked to melody/pitch perception and to listening to speech against a competing background. However, there is evidence that ENV cues recovered from TFS, as well as TFS itself, may be partially responsible for speech recognition. Current cochlear implant (CI) technologies are not efficient at delivering TFS cues, and new attempts have been made to incorporate TFS information into CI sound-processing strategies. Here we review the current findings on TFS.
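The ENV/TFS decomposition this review describes is commonly computed via the Hilbert transform: band-pass the signal, take the analytic signal, and read ENV off its magnitude and TFS off its phase. A minimal Python sketch for a single analysis band, assuming SciPy; the 1 kHz test tone, band edges, and filter order are illustrative choices, not from any of the papers above:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

fs = 16000                      # sample rate (Hz)
t = np.arange(0, 0.5, 1 / fs)   # 0.5 s of signal
# Hypothetical test signal: 1 kHz carrier with 10 Hz amplitude modulation
x = (1 + 0.8 * np.sin(2 * np.pi * 10 * t)) * np.sin(2 * np.pi * 1000 * t)

# One narrow analysis band around 1 kHz (4th-order Butterworth band-pass)
sos = butter(4, [900, 1100], btype="bandpass", fs=fs, output="sos")
band = sosfiltfilt(sos, x)

# Analytic signal: magnitude gives ENV, cosine of phase gives TFS
analytic = hilbert(band)
env = np.abs(analytic)            # slow envelope (ENV)
tfs = np.cos(np.angle(analytic))  # rapid fine structure (TFS)

# ENV * TFS reconstructs the band-limited signal (it equals Re(analytic))
recon = env * tfs
```

Vocoder-style experiments like those cited here repeat this per band, then resynthesize speech keeping only ENV (noise- or tone-excited) or only TFS (ENV flattened) to isolate each cue's contribution.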
Affiliation(s)
- Il Joon Moon
- Department of Otorhinolaryngology-Head and Neck Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Sung Hwa Hong
- Department of Otorhinolaryngology-Head and Neck Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
6. Swaminathan J, Reed CM, Desloge JG, Braida LD, Delhorne LA. Consonant identification using temporal fine structure and recovered envelope cues. J Acoust Soc Am 2014; 135:2078-2090. [PMID: 25235005] [PMCID: PMC4167752] [DOI: 10.1121/1.4865920]
Abstract
The contribution of recovered envelopes (RENVs) to the utilization of temporal fine-structure (TFS) speech cues was examined in normal-hearing listeners. Consonant identification experiments used speech stimuli processed to present TFS or RENV cues. Experiment 1 examined the effects of exposure and presentation order using 16-band TFS speech and 40-band RENV speech recovered from 16-band TFS speech. Prior exposure to TFS speech aided in the reception of RENV speech. Performance on the two conditions was similar (∼50% correct) for experienced listeners, as was the pattern of consonant confusions. Experiment 2 examined the effect of varying the number of RENV bands recovered from 16-band TFS speech. Mean identification scores decreased as the number of RENV bands decreased from 40 to 8 and were only slightly above chance levels for 16 and 8 bands. Experiment 3 examined the effect of varying the number of bands in the TFS speech from which 40-band RENV speech was constructed. Performance fell from 85% to 31% correct as the number of TFS bands increased from 1 to 32. Overall, these results suggest that the interpretation of previous studies that have used TFS speech may have been confounded by the presence of RENVs.
Affiliation(s)
- Jayaganesh Swaminathan
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
- Charlotte M Reed
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
- Joseph G Desloge
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
- Louis D Braida
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
- Lorraine A Delhorne
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
7. Won JH, Shim HJ, Lorenzi C, Rubinstein JT. Use of amplitude modulation cues recovered from frequency modulation for cochlear implant users when original speech cues are severely degraded. J Assoc Res Otolaryngol 2014; 15:423-439. [PMID: 24532186] [DOI: 10.1007/s10162-014-0444-1]
Abstract
Won et al. (J Acoust Soc Am 132:1113-1119, 2012) reported that cochlear implant (CI) speech processors generate amplitude-modulation (AM) cues recovered from broadband speech frequency modulation (FM) and that CI users can use these cues for speech identification in quiet. The present study was designed to extend this finding for a wide range of listening conditions, where the original speech cues were severely degraded by manipulating either the acoustic signals or the speech processor. The manipulation of the acoustic signals included the presentation of background noise, simulation of reverberation, and amplitude compression. The manipulation of the speech processor included changing the input dynamic range and the number of channels. For each of these conditions, multiple levels of speech degradation were tested. Speech identification was measured for CI users and compared for stimuli having both AM and FM information (intact condition) or FM information only (FM condition). Each manipulation degraded speech identification performance for both intact and FM conditions. Performance for the intact and FM conditions became similar for stimuli having the most severe degradations. Identification performance generally overlapped for the intact and FM conditions. Moreover, identification performance for the FM condition was better than chance performance even at the maximum level of distortion. Finally, significant correlations were found between speech identification scores for the intact and FM conditions. Altogether, these results suggest that despite poor frequency selectivity, CI users can make efficient use of AM cues recovered from speech FM in difficult listening situations.
Affiliation(s)
- Jong Ho Won
- Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN, 37996, USA
8. Imennov NS, Won JH, Drennan WR, Jameyson E, Rubinstein JT. Detection of acoustic temporal fine structure by cochlear implant listeners: behavioral results and computational modeling. Hear Res 2013; 298:60-72. [PMID: 23333260] [PMCID: PMC3605703] [DOI: 10.1016/j.heares.2013.01.004]
Abstract
A test of within-channel detection of acoustic temporal fine structure (aTFS) cues is presented. Eight cochlear implant (CI) listeners were asked to discriminate between two Schroeder-phase (SP) complexes using a two-alternative forced-choice task. Because differences between the acoustic stimuli are primarily constrained to their aTFS, successful discrimination reflects a combination of the subjects' perception of aTFS cues and the strategy's ability to deliver them. Subjects were mapped with single-channel Continuous Interleaved Sampling (CIS) and Simultaneous Analog Stimulation (SAS) strategies. To compare within- and across-channel delivery of aTFS cues, a 16-channel clinical HiRes strategy was also fitted. Throughout testing, SAS consistently outperformed the CIS strategy (p ≤ 0.002). For SP stimuli with F0 = 50 Hz, the highest discrimination scores were achieved with the HiRes encoding, followed by scores with the SAS and the CIS strategies, respectively. At 200 Hz, single-channel SAS performed better than HiRes (p = 0.022), demonstrating that under a more challenging testing condition, discrimination performance with a single-channel analog encoding can exceed that of a 16-channel pulsatile strategy. To better understand the intermediate steps of discrimination, a biophysical model was used to examine the neural discharges evoked by the SP stimuli. Discrimination estimates calculated from simulated neural responses successfully tracked the behavioral performance trends of single-channel CI listeners.
Affiliation(s)
- Nikita S. Imennov
- Department of Bioengineering, University of Washington, Seattle, WA 98195
- VM Bloedel Hearing Research Center, University of Washington, Seattle, WA 98195
- Jong Ho Won
- Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN 37996
- Ward R. Drennan
- VM Bloedel Hearing Research Center, University of Washington, Seattle, WA 98195
- Department of Otolaryngology, Head & Neck Surgery, University of Washington, Seattle, WA 98195
- Elyse Jameyson
- VM Bloedel Hearing Research Center, University of Washington, Seattle, WA 98195
- Department of Otolaryngology, Head & Neck Surgery, University of Washington, Seattle, WA 98195
- Jay T. Rubinstein
- Department of Bioengineering, University of Washington, Seattle, WA 98195
- VM Bloedel Hearing Research Center, University of Washington, Seattle, WA 98195
- Department of Otolaryngology, Head & Neck Surgery, University of Washington, Seattle, WA 98195