1. Shahidi LK, Collins LM, Mainsah BO. Objective intelligibility measurement of reverberant vocoded speech for normal-hearing listeners: Towards facilitating the development of speech enhancement algorithms for cochlear implants. The Journal of the Acoustical Society of America 2024; 155:2151-2168. [PMID: 38501923 PMCID: PMC10959555 DOI: 10.1121/10.0025285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 12/29/2023] [Accepted: 02/24/2024] [Indexed: 03/20/2024]
Abstract
Cochlear implant (CI) recipients often struggle to understand speech in reverberant environments. Speech enhancement algorithms could restore speech perception for CI listeners by removing reverberant artifacts from the CI stimulation pattern. Listening studies, either with CI recipients or normal-hearing (NH) listeners using a CI acoustic model, provide a benchmark for speech intelligibility improvements conferred by the enhancement algorithm but are costly and time-consuming. To reduce the associated costs during algorithm development, speech intelligibility could be estimated offline using objective intelligibility measures. Previous evaluations of objective measures that considered CIs primarily assessed the combined impact of noise and reverberation and employed highly accurate enhancement algorithms. To facilitate the development of enhancement algorithms, we evaluate twelve objective measures in reverberant-only conditions characterized by a gradual reduction of reverberant artifacts, simulating the performance of an enhancement algorithm during development. Measures are validated against the performance of NH listeners using a CI acoustic model. To enhance compatibility with reverberant CI-processed signals, measure performance was assessed after modifying the reference signal and spectral filterbank. Measures leveraging the speech-to-reverberant ratio, cepstral distance and, after modifying the reference or filterbank, envelope correlation are strong predictors of intelligibility for reverberant CI-processed speech.
Affiliation(s)
- Lidea K Shahidi
  - Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina 27701, USA
- Leslie M Collins
  - Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina 27701, USA
- Boyla O Mainsah
  - Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina 27701, USA
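As a rough illustration of one family of measures named in the abstract above, a speech-to-reverberant ratio can be written as the energy ratio, in decibels, between a reference signal and the reverberant residual. The sketch below is a minimal, assumed formulation and not the measure implementation evaluated in the paper; the function name `speech_to_reverberant_ratio_db`, the `eps` guard, and the synthetic signals are all hypothetical.

```python
import numpy as np


def speech_to_reverberant_ratio_db(reference, reverberant, eps=1e-12):
    """Energy ratio (dB) between a reference signal and the reverberant residual.

    Assumption: the residual is simply `reverberant - reference`, with both
    signals time-aligned and of equal length. `eps` avoids division by zero.
    """
    reference = np.asarray(reference, dtype=float)
    reverberant = np.asarray(reverberant, dtype=float)
    if reference.shape != reverberant.shape:
        raise ValueError("reference and reverberant must have the same shape")
    residual = reverberant - reference
    reference_energy = np.sum(reference ** 2)
    residual_energy = np.sum(residual ** 2)
    return 10.0 * np.log10((reference_energy + eps) / (residual_energy + eps))


# Synthetic stand-in signals (not speech): white noise plus one delayed,
# attenuated copy acting as a crude reflection.
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
reflection = 0.5 * np.roll(clean, 800)
reflection[:800] = 0.0  # discard the samples that wrapped around
reverberant = clean + reflection

srr_db = speech_to_reverberant_ratio_db(clean, reverberant)
print(f"SRR: {srr_db:.2f} dB")
```

A higher value indicates less residual reverberant energy relative to the reference, which is the direction an enhancement algorithm would aim to move during development.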
2. Chu K, Collins L, Mainsah B. Suppressing reverberation in cochlear implant stimulus patterns using time-frequency masks based on phoneme groups. Proceedings of Meetings on Acoustics. Acoustical Society of America 2022; 50:050002. [PMID: 38031629 PMCID: PMC10686264 DOI: 10.1121/2.0001698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2023]
Abstract
Cochlear implant (CI) users experience considerable difficulty in understanding speech in reverberant listening environments. This issue is commonly addressed with time-frequency masking, where a time-frequency decomposed reverberant signal is multiplied by a matrix of gain values to suppress reverberation. However, mask estimation is challenging in reverberant environments due to the large spectro-temporal variations in the speech signal. To overcome this variability, we previously developed a phoneme-based algorithm that selects a different mask estimation model based on the underlying phoneme. In the ideal case where knowledge of the phoneme was assumed, the phoneme-based approach provided larger benefits than a phoneme-independent approach when tested in normal-hearing listeners using an acoustic model of CI processing. The current work investigates the phoneme-based mask estimation algorithm in the real-time feasible case where the prediction from a phoneme classifier is used to select the phoneme-specific mask. To further ensure real-time feasibility, both the phoneme classifier and mask estimation algorithm use causal features extracted from within the CI processing framework. We conducted experiments in normal-hearing listeners using an acoustic model of CI processing, and the results showed that the phoneme-specific algorithm benefitted the majority of subjects.
Affiliation(s)
- Kevin Chu
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, 27705
- Leslie Collins
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, 27705
- Boyla Mainsah
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, 27705
3. Shahidi LK, Collins LM, Mainsah BO. Parameter tuning of time-frequency masking algorithms for reverberant artifact removal within the cochlear implant stimulus. Cochlear Implants Int 2022; 23:309-316. [PMID: 35875863 PMCID: PMC9611765 DOI: 10.1080/14670100.2022.2096182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
Abstract
Cochlear implant recipients struggle to understand speech in reverberant environments. To restore speech perception, artifacts due to reverberant reflections can be removed from the cochlear implant stimulus by applying a matrix of gain values, a technique referred to as time-frequency masking. In this study, two common time-frequency masking strategies are implemented within cochlear implant processing, either introducing complete retention or deletion of stimulus components using a binary mask or continuous attenuation of stimulus components using a ratio mask. Parameters of each masking strategy control the level of attenuation imposed by the gain values. In this study, we perceptually tune the parameters of the masking strategy to determine a balance between speech retention and artifact removal. We measure the intelligibility of reverberant signals mitigated by each strategy with speech recognition testing in normal-hearing listeners using vocoding as a simulation of cochlear implant perception. For both masking strategies, we find parameterizations that maximize the intelligibility of the mitigated signals. At the best-performing parameterizations, binary-masked reverberant signals yield larger intelligibility improvements than ratio-masked signals. The results provide a perceptually optimized objective for the removal of reverberant artifacts from cochlear implant stimuli, facilitating improved speech recognition performance for cochlear implant recipients in reverberant environments.
Affiliation(s)
- Lidea K Shahidi
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
- Leslie M Collins
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
- Boyla O Mainsah
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
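The two masking strategies described in the abstract above, a binary mask that fully retains or deletes stimulus components and a ratio mask that attenuates them continuously, can be sketched with their common textbook forms. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes the direct and reverberant energies per channel and frame are known and add to give the stimulus energy, and the names `binary_mask`, `ratio_mask`, and the `threshold_db` parameter are hypothetical.

```python
import numpy as np


def ratio_mask(direct_energy, reverb_energy, eps=1e-12):
    """Continuous gain in [0, 1]: fraction of energy attributed to direct speech."""
    return direct_energy / (direct_energy + reverb_energy + eps)


def binary_mask(direct_energy, reverb_energy, threshold_db=0.0, eps=1e-12):
    """Gain of 1 where the local direct-to-reverberant ratio exceeds the
    threshold (in dB), 0 elsewhere. `threshold_db` is an assumed tuning knob."""
    local_ratio_db = 10.0 * np.log10((direct_energy + eps) / (reverb_energy + eps))
    return (local_ratio_db > threshold_db).astype(float)


# Toy energies: 2 channels x 3 time frames (made-up numbers).
direct = np.array([[4.0, 1.0, 0.0],
                   [2.0, 1.0, 9.0]])
reverb = np.array([[1.0, 4.0, 1.0],
                   [1.0, 3.0, 1.0]])
stimulus = direct + reverb  # assumption: energies are additive

bm = binary_mask(direct, reverb)
rm = ratio_mask(direct, reverb)

# Time-frequency masking: element-wise multiplication by the gain matrix.
masked_binary = bm * stimulus
masked_ratio = rm * stimulus

print(bm)
print(np.round(rm, 3))
```

Raising `threshold_db` makes the binary mask delete more components, which trades speech retention for artifact removal; this is the kind of parameter the abstract describes tuning perceptually.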
4. Hussain T, Siniscalchi SM, Wang HLS, Tsao Y, Salerno VM, Liao WH. Ensemble Hierarchical Extreme Learning Machine for Speech Dereverberation. IEEE Trans Cogn Dev Syst 2020. [DOI: 10.1109/tcds.2019.2953620] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/06/2022]
5. Chu KM, Throckmorton CS, Collins LM, Mainsah BO. Using machine learning to mitigate the effects of reverberation and noise in cochlear implants. Proceedings of Meetings on Acoustics. Acoustical Society of America 2018; 33:050003. [PMID: 32582407 PMCID: PMC7314361 DOI: 10.1121/2.0000905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
In listening environments with room reverberation and background noise, cochlear implant (CI) users experience substantial difficulties in understanding speech. Because everyday environments have different combinations of reverberation and noise, there is a need to develop algorithms that can mitigate both effects to improve speech intelligibility. Desmond et al. (2014) developed a machine learning approach to mitigate the adverse effects of late reverberant reflections of speech signals by using a classifier to detect and remove affected segments in CI pulse trains. This study aimed to investigate the robustness of the reverberation mitigation algorithm in environments with both reverberation and noise. Sentence recognition tests were conducted in normal-hearing listeners using vocoded speech with unmitigated and mitigated reverberant-only or noisy reverberant speech signals, across different reverberation times and noise types. Improvements in speech intelligibility were observed in mitigated reverberant-only conditions. However, mixed results were obtained in the mitigated noisy reverberant conditions, as a reduction in speech intelligibility was observed for noise types whose spectra were similar to that of anechoic speech. Based on these results, the focus of future work is to develop a context-dependent approach that activates different mitigation strategies for different acoustic environments.
Affiliation(s)
- Kevin M Chu
  - Department of Biomedical Engineering, Duke University, Durham, NC
- Leslie M Collins
  - Department of Biomedical Engineering, Duke University, Durham, NC
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC
- Boyla O Mainsah
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC
6. Hazrati O, Ali H, Hansen JHL, Tobey E. Evaluation and analysis of whispered speech for cochlear implant users: Gender identification and intelligibility. The Journal of the Acoustical Society of America 2015; 138:74-79. [PMID: 26233008 PMCID: PMC4491014 DOI: 10.1121/1.4922230] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/11/2014] [Revised: 02/27/2015] [Accepted: 05/19/2015] [Indexed: 06/04/2023]
Abstract
This study investigates the degree to which whispered speech impacts speech perception and gender identification in cochlear implant (CI) users. Listening experiments with six CI subjects under neutral and whispered speech conditions using sentences from the UT-Vocal Effort II corpus (recordings from male and female speakers) were conducted. Results indicated a significant effect of whispering on gender identification and speech intelligibility scores. In addition, no significant effect of talker gender on the speech/gender identification scores was observed. Results also suggested that exposure to longer speech stimuli, and consequently more temporal cues, would not improve gender identification performance in CI subjects.
Affiliation(s)
- Oldooz Hazrati
  - Department of Electrical Engineering, The University of Texas at Dallas, Richardson, Texas 75080, USA
- Hussnain Ali
  - Department of Electrical Engineering, The University of Texas at Dallas, Richardson, Texas 75080, USA
- John H L Hansen
  - Department of Electrical Engineering, The University of Texas at Dallas, Richardson, Texas 75080, USA
- Emily Tobey
  - School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas 75080, USA
7. Desmond JM, Collins LM, Throckmorton CS. The effects of reverberant self- and overlap-masking on speech recognition in cochlear implant listeners. The Journal of the Acoustical Society of America 2014; 135:EL304-EL310. [PMID: 24907838 DOI: 10.1121/1.4879673] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/03/2023]
Abstract
Many cochlear implant (CI) listeners experience decreased speech recognition in reverberant environments [Kokkinakis et al., J. Acoust. Soc. Am. 129(5), 3221-3232 (2011)], which may be caused by a combination of self- and overlap-masking [Bolt and MacDonald, J. Acoust. Soc. Am. 21(6), 577-580 (1949)]. Determining the extent to which these effects decrease speech recognition for CI listeners may influence reverberation mitigation algorithms. This study compared speech recognition with ideal self-masking mitigation, with ideal overlap-masking mitigation, and with no mitigation. Under these conditions, mitigating either self- or overlap-masking resulted in significant improvements in speech recognition both for normal-hearing subjects using an acoustic model and for CI listeners using their own devices.
Affiliation(s)
- Jill M Desmond
  - Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina 27708
- Leslie M Collins
  - Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina 27708
- Chandra S Throckmorton
  - Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina 27708