1. Aldag N, Nogueira W. Psychoacoustic and electroencephalographic responses to changes in amplitude modulation depth and frequency in relation to speech recognition in cochlear implantees. Sci Rep 2024; 14:8181. [PMID: 38589483; PMCID: PMC11002021; DOI: 10.1038/s41598-024-58225-1]
Abstract
Temporal envelope modulations (TEMs) are one of the most important features that cochlear implant (CI) users rely on to understand speech. Electroencephalographic assessment of TEM encoding could help clinicians to predict speech recognition more objectively, even in patients unable to provide active feedback. The acoustic change complex (ACC) and the auditory steady-state response (ASSR) evoked by low-frequency amplitude-modulated pulse trains can be used to assess TEM encoding with electrical stimulation of individual CI electrodes. In this study, we focused on amplitude modulation detection (AMD) and amplitude modulation frequency discrimination (AMFD) with stimulation of a basal versus an apical electrode. In twelve adult CI users, we (a) assessed behavioral AMFD thresholds and (b) recorded cortical auditory evoked potentials (CAEPs), AMD-ACC, AMFD-ACC, and ASSR in a combined 3-stimulus paradigm. We found that the electrophysiological responses were significantly higher for apical than for basal stimulation. Peak amplitudes of the AMFD-ACC were small and therefore did not correlate with speech-in-noise recognition. We found significant correlations between speech-in-noise recognition and (a) behavioral AMFD thresholds and (b) AMD-ACC peak amplitudes. AMD and AMFD hold potential for the development of a clinically applicable tool for assessing TEM encoding to predict speech recognition in CI users.
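The ASSR discussed in this entry is typically quantified as the spectral amplitude of the averaged EEG at the stimulus modulation frequency. The following is a minimal sketch of that computation on simulated data, not the authors' analysis; the sampling rate, modulation frequency, and noise level are illustrative assumptions.

```python
import numpy as np

# Illustrative parameters (assumed, not taken from the study): 40 Hz
# amplitude modulation, 1 s analysis window, 1 kHz EEG sampling rate.
fs = 1000          # EEG sampling rate in Hz
fm = 40.0          # amplitude-modulation frequency in Hz
t = np.arange(0, 1.0, 1 / fs)

# Simulate an averaged EEG epoch: a small response at the modulation
# frequency buried in broadband noise.
rng = np.random.default_rng(0)
eeg = 0.5 * np.sin(2 * np.pi * fm * t) + rng.normal(0, 2.0, t.size)

# ASSR amplitude = spectral magnitude at the modulation frequency.
spectrum = np.abs(np.fft.rfft(eeg)) / t.size * 2
freqs = np.fft.rfftfreq(t.size, 1 / fs)
assr_amp = spectrum[np.argmin(np.abs(freqs - fm))]

# A simple SNR estimate: response bin versus the mean of neighbouring bins.
mask = (np.abs(freqs - fm) > 1) & (np.abs(freqs - fm) < 5)
snr_db = 20 * np.log10(assr_amp / spectrum[mask].mean())
print(f"ASSR amplitude at {fm:.0f} Hz: {assr_amp:.3f} uV, SNR: {snr_db:.1f} dB")
```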
Affiliation(s)
- Nina Aldag: Department of Otolaryngology, Hannover Medical School and Cluster of Excellence 'Hearing4all', Hanover, Germany
- Waldo Nogueira: Department of Otolaryngology, Hannover Medical School and Cluster of Excellence 'Hearing4all', Hanover, Germany
2. Choi I, Gander PE, Berger JI, Woo J, Choy MH, Hong J, Colby S, McMurray B, Griffiths TD. Spectral Grouping of Electrically Encoded Sound Predicts Speech-in-Noise Performance in Cochlear Implantees. J Assoc Res Otolaryngol 2023; 24:607-617. [PMID: 38062284; PMCID: PMC10752853; DOI: 10.1007/s10162-023-00918-x]
Abstract
OBJECTIVES Cochlear implant (CI) users exhibit large variability in understanding speech in noise. Past work in CI users found that spectral and temporal resolution correlates with speech-in-noise ability, but a large portion of variance remains unexplained. Recent work on normal-hearing listeners showed that the ability to group temporally and spectrally coherent tones in a complex auditory scene predicts speech-in-noise ability independently of the audiogram, highlighting a central mechanism for auditory scene analysis that contributes to speech-in-noise understanding. The current study examined whether this auditory grouping ability also contributes to speech-in-noise understanding in CI users. DESIGN Forty-seven post-lingually deafened CI users were tested with psychophysical measures of spectral and temporal resolution, a stochastic figure-ground task that depends on detecting a figure by grouping multiple fixed-frequency elements against a random background, and a sentence-in-noise measure. Multiple linear regression was used to predict sentence-in-noise performance from the other tasks. RESULTS No collinearity was found among the predictor variables. All three predictors (spectral and temporal resolution plus the figure-ground task) contributed significantly to the multiple linear regression model, indicating that auditory grouping ability in a complex auditory scene explains a further proportion of variance in CI users' speech-in-noise performance beyond that explained by spectral and temporal resolution. CONCLUSION Measures of cross-frequency grouping reflect an auditory cognitive mechanism that determines speech-in-noise understanding independently of cochlear function. Such measures are easily implemented clinically as predictors of CI success and suggest potential strategies for rehabilitation based on training with non-speech stimuli.
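As a rough illustration of the regression analysis described above (not the authors' code), a multiple linear regression with the three predictors and a simple collinearity screen could look like the sketch below; all data are simulated and the variable roles are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: rows are CI users, columns are the three predictors
# (spectral resolution, temporal resolution, figure-ground sensitivity).
rng = np.random.default_rng(1)
n = 47
X = rng.normal(size=(n, 3))
# Simulated sentence-in-noise score with contributions from all predictors.
y = 0.4 * X[:, 0] + 0.3 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 1, n)

# Screen for collinearity with pairwise predictor correlations.
print("predictor correlation matrix:\n", np.corrcoef(X, rowvar=False).round(2))

# Multiple linear regression predicting speech-in-noise from all predictors.
model = LinearRegression().fit(X, y)
print("coefficients:", model.coef_.round(2))
print("R^2:", round(model.score(X, y), 2))
```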
Affiliation(s)
- Inyong Choi: Department of Communication Sciences and Disorders, University of Iowa, 250 Hawkins Dr., Iowa City, IA 52242, USA; Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA
- Phillip E Gander: Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA; Department of Neurosurgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA; Department of Radiology, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA
- Joel I Berger: Department of Neurosurgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA
- Jihwan Woo: Department of Biomedical Engineering, University of Ulsan, Ulsan, Republic of Korea
- Matthew H Choy: Biosciences Institute, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
- Jean Hong: Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA
- Sarah Colby: Department of Psychological and Brain Sciences, University of Iowa, Iowa City, IA 52242, USA
- Bob McMurray: Department of Communication Sciences and Disorders, University of Iowa, 250 Hawkins Dr., Iowa City, IA 52242, USA; Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA; Department of Psychological and Brain Sciences, University of Iowa, Iowa City, IA 52242, USA
- Timothy D Griffiths: Biosciences Institute, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
3. Jeon MJ, Woo J. Effect of speech-stimulus degradation on phoneme-related potential. PLoS One 2023; 18:e0287584. [PMID: 37352220; PMCID: PMC10289326; DOI: 10.1371/journal.pone.0287584]
Abstract
Auditory evoked potentials (AEPs) have been used to evaluate hearing and speech cognition. Because an AEP generates a very small voltage relative to ambient noise, a stimulus such as a tone, word, or short sentence must be presented repeatedly so that responses can be ensemble-averaged across trials. However, repeated presentation of short words and sentences creates an unnatural listening situation for the subject. Phoneme-related potentials (PRPs), which are evoked responses to typical phonemic stimuli, can be extracted from electroencephalography (EEG) data recorded in response to a continuous storybook. In this study, we investigated the effects of spectrally degraded speech stimuli on PRPs. EEG data in response to spectrally degraded and natural storybooks were recorded from normal-hearing listeners, and the PRP components for 10 vowels and 12 consonants were extracted. The PRP responses to a vocoded (spectrally degraded) storybook showed significantly lower peak amplitudes and prolonged latencies compared with those to a natural storybook. These findings suggest that PRPs, like other AEPs, can serve as a potential tool for evaluating hearing and speech cognition. Moreover, PRPs can provide details of phonological processing and phonemic awareness that help explain poor speech intelligibility. Further investigation with hearing-impaired listeners is required prior to clinical application.
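In general terms, a PRP is obtained by epoching the continuous-speech EEG around the onsets of a given phoneme class and ensemble-averaging the epochs. The sketch below illustrates that idea on simulated data; the sampling rate, window, and onset times are assumptions and this is not the authors' pipeline.

```python
import numpy as np

# Hypothetical single-channel EEG recorded during a continuous storybook.
fs = 500                              # sampling rate in Hz (assumed)
rng = np.random.default_rng(2)
eeg = rng.normal(size=fs * 600)       # 10 minutes of simulated EEG

# Assumed onset times (in seconds) for one phoneme class, e.g. /a/.
onsets = np.sort(rng.uniform(1.0, 598.0, size=300))

# Epoch window around each onset: -100 ms to +500 ms.
pre, post = int(0.1 * fs), int(0.5 * fs)
epochs = np.stack([eeg[int(t * fs) - pre: int(t * fs) + post] for t in onsets])

# Baseline-correct each epoch and average to obtain the phoneme-related
# potential (PRP) for this phoneme class.
epochs -= epochs[:, :pre].mean(axis=1, keepdims=True)
prp = epochs.mean(axis=0)
print("PRP length (samples):", prp.shape[0])
```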
Affiliation(s)
- Min-Jae Jeon: Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, Republic of Korea
- Jihwan Woo: Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, Republic of Korea; Department of Biomedical Engineering, University of Ulsan, Ulsan, Republic of Korea
4. Dolhopiatenko H, Nogueira W. Selective attention decoding in bimodal cochlear implant users. Front Neurosci 2023; 16:1057605. [PMID: 36711138; PMCID: PMC9874229; DOI: 10.3389/fnins.2022.1057605]
Abstract
The growing group of cochlear implant (CI) users includes subjects with preserved acoustic hearing on the side opposite to the CI. Using both listening sides results in improved speech perception compared with listening with one side alone; however, large variability in the measured benefit is observed. It is possible that this variability is associated with the integration of speech across the electric and acoustic stimulation modalities. However, there is a lack of established methods to assess speech integration between electric and acoustic stimulation and, consequently, to adequately program the devices. Moreover, existing methods do not provide information about the underlying physiological mechanisms of this integration, or they are based on simple stimuli that are difficult to relate to speech integration. Electroencephalography (EEG) to continuous speech is promising as an objective measure of speech perception; however, its application in CIs is challenging because it is influenced by the electrical artifact introduced by these devices. For this reason, the main goal of this work was to investigate a possible electrophysiological measure of speech integration between electric and acoustic stimulation in bimodal CI users. For this purpose, a selective attention decoding paradigm was designed and validated in bimodal CI users. The study included behavioral and electrophysiological measures. The behavioral measure consisted of a speech understanding test in which subjects repeated words from a target speaker in the presence of a competing voice, listening with the CI side (CIS) only, with the acoustic side (AS) only, or with both listening sides (CIS+AS). Electrophysiological measures included cortical auditory evoked potentials (CAEPs) and selective attention decoding through EEG. CAEPs were recorded to broadband stimuli to confirm the feasibility of recording cortical responses with the CIS only, AS only, and CIS+AS listening modes. In the selective attention decoding paradigm, a co-located target and a competing speech stream were presented using the three listening modes (CIS only, AS only, and CIS+AS). The main hypothesis was that selective attention can be decoded in CI users despite the presence of the CI electrical artifact; if selective attention decoding improves when combining electric and acoustic stimulation relative to electric stimulation alone, the hypothesis can be confirmed. No significant difference in behavioral speech understanding was found between listening with CIS+AS and AS only, mainly due to a ceiling effect observed with these two listening modes. The main finding of the study is that selective attention can be decoded in CI users even when a continuous electrical artifact is present. Moreover, an amplitude reduction of the forward temporal response function (TRF) of selective attention decoding was observed when listening with CIS+AS compared with AS only. Further studies are required to validate selective attention decoding as an electrophysiological measure of electric-acoustic speech integration.
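For orientation only: one common way to decode selective attention from EEG (distinct from the forward TRF analysis described above) is a backward stimulus-reconstruction model, where a linear decoder maps EEG to the speech envelope and attention is inferred from which stream the reconstruction correlates with best. The sketch below shows that general approach on simulated data; it is not the authors' method, omits time-lagged features, and all parameters are assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Simulated stand-ins for real recordings: multi-channel EEG plus the
# temporal envelopes of an attended and an ignored speech stream.
rng = np.random.default_rng(3)
fs, n_ch, n_s = 64, 32, 64 * 300            # 5 minutes at 64 Hz (assumed)
env_att = np.abs(rng.normal(size=n_s))      # attended-speech envelope
env_ign = np.abs(rng.normal(size=n_s))      # ignored-speech envelope
eeg = np.outer(env_att, rng.normal(size=n_ch)) + rng.normal(size=(n_s, n_ch))

# Backward model: reconstruct the attended envelope from EEG with ridge
# regression, then compare reconstruction accuracy for the two streams.
half = n_s // 2
decoder = Ridge(alpha=1.0).fit(eeg[:half], env_att[:half])
recon = decoder.predict(eeg[half:])
r_att = np.corrcoef(recon, env_att[half:])[0, 1]
r_ign = np.corrcoef(recon, env_ign[half:])[0, 1]
print(f"r(attended) = {r_att:.2f}, r(ignored) = {r_ign:.2f}")
# Attention is decoded correctly if r_att > r_ign on the held-out data.
```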
5. Shim H, Kim S, Hong J, Na Y, Woo J, Hansen M, Gantz B, Choi I. Differences in neural encoding of speech in noise between cochlear implant users with and without preserved acoustic hearing. Hear Res 2023; 427:108649. [PMID: 36462377; PMCID: PMC9842477; DOI: 10.1016/j.heares.2022.108649]
Abstract
Cochlear implants (CIs) have evolved to combine residual acoustic hearing with electric hearing. CI users with residual acoustic hearing are expected to show better speech-in-noise perception than CI-only listeners because the preserved acoustic cues aid in unmasking speech from background noise. This study sought the neural substrate of better speech unmasking in CI users with preserved acoustic hearing compared with those with a lower degree of acoustic hearing. Cortical evoked responses to speech in multi-talker babble noise were compared between 29 Hybrid (i.e., electric-acoustic stimulation, or EAS) and 29 electric-only CI users. The amplitude ratio of the evoked responses to speech and to noise, or internal SNR, was significantly larger in the CI users with EAS. This result indicates that CI users with better residual acoustic hearing exhibit enhanced unmasking of speech from background noise.
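The "internal SNR" described here is an amplitude ratio between the speech-evoked and noise-evoked cortical responses. A minimal sketch of that ratio on simulated averaged responses is given below; the waveforms, window, and peak-based metric are illustrative assumptions, not the study's exact procedure.

```python
import numpy as np

# Simulated averaged cortical responses (in arbitrary units): one evoked by
# the speech-target onset, one evoked by the babble-noise onset.
rng = np.random.default_rng(4)
t = np.arange(-0.1, 0.6, 1 / 500)                 # 500 Hz, -100 to 600 ms
speech_resp = 3.0 * np.exp(-((t - 0.1) / 0.05) ** 2) + rng.normal(0, 0.3, t.size)
noise_resp = 1.5 * np.exp(-((t - 0.1) / 0.05) ** 2) + rng.normal(0, 0.3, t.size)

# Internal SNR as an amplitude ratio: peak of the speech-evoked response
# divided by that of the noise-evoked response within a post-onset window.
win = (t > 0.05) & (t < 0.25)
internal_snr = np.abs(speech_resp[win]).max() / np.abs(noise_resp[win]).max()
print(f"internal SNR (amplitude ratio): {internal_snr:.2f}")
```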
Affiliation(s)
- Hwan Shim: Department of Electrical and Computer Engineering Technology, Rochester Institute of Technology, Rochester, NY 14623, USA
- Subong Kim: Department of Communication Sciences and Disorders, Montclair State University, Montclair, NJ 07043, USA
- Jean Hong: Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA 52242, USA
- Youngmin Na: Department of Neurosurgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA
- Jihwan Woo: Department of Biomedical Engineering, University of Ulsan, Ulsan, Republic of Korea
- Marlan Hansen: Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA
- Bruce Gantz: Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA
- Inyong Choi: Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA 52242, USA; Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA
6. Lee JH, Shim H, Gantz B, Choi I. Strength of Attentional Modulation on Cortical Auditory Evoked Responses Correlates with Speech-in-Noise Performance in Bimodal Cochlear Implant Users. Trends Hear 2022; 26:23312165221141143. [PMID: 36464791; PMCID: PMC9726851; DOI: 10.1177/23312165221141143]
Abstract
Auditory selective attention is a crucial top-down cognitive mechanism for understanding speech in noise. Cochlear implant (CI) users display great variability in speech-in-noise performance that is not easily explained by peripheral auditory profile or demographic factors. Thus, it is imperative to understand whether auditory cognitive processes such as selective attention explain such variability. The present study directly addressed this question by quantifying attentional modulation of cortical auditory responses during an attention task and comparing its individual differences with speech-in-noise performance. In the attention experiment, CI participants were given a pre-stimulus visual cue that directed their attention to one of two speech streams and were asked to detect a deviant syllable in the target stream. The two speech streams consisted of a female voice saying "Up" five times every 800 ms and a male voice saying "Down" four times every 1 s. The onset of each syllable elicited distinct event-related potentials (ERPs). At each syllable onset, the difference in ERP amplitudes between the two attentional conditions (attended minus ignored) was computed; this ERP amplitude difference served as a proxy for attentional modulation strength. Group-level analysis showed that ERP amplitudes were greater when a syllable was attended than when it was ignored, demonstrating that attention modulated cortical auditory responses. Moreover, the strength of attentional modulation correlated significantly with speech-in-noise performance. These results suggest that attentional modulation of cortical auditory responses may provide a neural marker for predicting CI users' success in clinical tests of speech-in-noise listening.
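The attentional-modulation metric described above reduces to a per-subject difference score (attended minus ignored ERP amplitude) that is then correlated with speech-in-noise performance. The sketch below illustrates that final step on simulated subject-level data; the numbers and sample size are assumptions, not the study's data.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-subject values: ERP amplitudes at syllable onsets when the
# stream was attended vs. ignored, plus a speech-in-noise score, for 20
# simulated participants.
rng = np.random.default_rng(5)
n_sub = 20
erp_attended = rng.normal(4.0, 1.0, n_sub)
erp_ignored = rng.normal(2.5, 1.0, n_sub)
speech_in_noise = 50 + 8 * (erp_attended - erp_ignored) + rng.normal(0, 5, n_sub)

# Attentional-modulation strength = attended minus ignored ERP amplitude.
attn_mod = erp_attended - erp_ignored

# Correlate modulation strength with speech-in-noise performance.
r, p = pearsonr(attn_mod, speech_in_noise)
print(f"r = {r:.2f}, p = {p:.3f}")
```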
Affiliation(s)
- Jae-Hee Lee: Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA 52242, USA; Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA
- Hwan Shim: Department of Electrical and Computer Engineering Technology, Rochester Institute of Technology, Rochester, NY 14623, USA
- Bruce Gantz: Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA
- Inyong Choi: Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA 52242, USA; Department of Otolaryngology-Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA (Correspondence: Inyong Choi, 250 Hawkins Dr., Iowa City, IA 52242, USA)
7. Matz AF, Nie Y, Wheeler HJ. Auditory stream segregation of amplitude-modulated narrowband noise in cochlear implant users and individuals with normal hearing. Front Psychol 2022; 13:927854. [PMID: 36118488; PMCID: PMC9479457; DOI: 10.3389/fpsyg.2022.927854]
Abstract
Voluntary stream segregation was investigated in cochlear implant (CI) users and normal-hearing (NH) listeners using a segregation-promoting objective approach that evaluated the role of spectral and amplitude-modulation (AM) rate separations in stream segregation and its build-up. Sequences of 9 or 3 pairs of A and B narrowband noise (NBN) bursts were presented, differing in the center frequency of the noise band, the AM rate, or both. In some sequences (delayed sequences), the last B burst was delayed by 35 ms from its otherwise steady temporal position; in the other sequences (no-delay sequences), the last B burst was temporally advanced by 0 to 10 ms. A single-interval yes/no procedure was used to measure participants' sensitivity (d′) in identifying delayed versus no-delay sequences, with higher d′ values indicating a greater ability to segregate the A and B subsequences. For NH listeners, performance improved with each increase in spectral separation, whereas for CI users performance was significantly better only in the condition with the largest spectral separation. In addition, performance was significantly poorer for the largest AM-rate separation than for the condition with no AM-rate separation in both groups. A significant effect of sequence duration in both groups indicated that listeners improved as the duration of the stimulus sequences lengthened, supporting a build-up effect. The results of this study suggest that CI users are less able than NH listeners to segregate NBN bursts into different auditory streams when the bursts are moderately separated in the spectral domain. Contrary to our hypothesis, the results indicate that AM-rate separation may interfere with the segregation of NBN streams. The results also add evidence to the literature that CI users build up stream segregation at a rate comparable to NH listeners when inter-stream spectral separations are sufficiently large.
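The sensitivity index d′ used in this yes/no procedure is conventionally computed as the difference between the z-transformed hit and false-alarm rates. The sketch below shows that standard computation; the counts and the log-linear correction are illustrative assumptions, not the study's data or exact convention.

```python
from scipy.stats import norm

def d_prime(hits: int, misses: int, false_alarms: int, correct_rejections: int) -> float:
    """d' for a yes/no task: z(hit rate) - z(false-alarm rate).

    A simple log-linear correction keeps rates away from 0 and 1 so the
    z-transform stays finite (one common convention, assumed here).
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Hypothetical counts from identifying delayed vs. no-delay sequences.
print(round(d_prime(hits=40, misses=10, false_alarms=12, correct_rejections=38), 2))
```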
Affiliation(s)
- Alexandria F. Matz: Department of Otolaryngology, Eastern Virginia Medical School, Norfolk, VA, USA
- Yingjiu Nie: Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA (Correspondence: Yingjiu Nie)
- Harley J. Wheeler: Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, Minneapolis, MN, USA
8. Na Y, Joo H, Trang LT, Quan LDA, Woo J. Objective speech intelligibility prediction using a deep learning model with continuous speech-evoked cortical auditory responses. Front Neurosci 2022; 16:906616. [PMID: 36061597; PMCID: PMC9433707; DOI: 10.3389/fnins.2022.906616]
Abstract
Auditory prostheses provide an opportunity for the rehabilitation of hearing-impaired patients. Speech intelligibility can be used to estimate the extent to which an auditory prosthesis improves the user's speech comprehension. Although behavior-based speech intelligibility testing is the gold standard, precise evaluation is limited by its subjectiveness. Here, we used a convolutional neural network to predict speech intelligibility from electroencephalography (EEG). Sixty-four-channel EEG was recorded from 87 adult participants with normal hearing. Sentences spectrally degraded by 2-, 3-, 4-, 5-, and 8-channel vocoders were used to create relatively low speech intelligibility conditions, and a Korean sentence recognition test was used. The speech intelligibility scores were divided into 41 discrete levels ranging from 0 to 100% in steps of 2.5%; three scores (30.0, 37.5, and 40.0%) were not collected. Two speech features, the speech temporal envelope (ENV) and phoneme (PH) onsets, were used to extract continuous-speech-evoked EEG responses for speech intelligibility prediction. The deep learning model was trained on event-related potentials (ERPs), or on correlation coefficients between the ERPs and the ENV, between the ERPs and the PH onsets, or between the ERPs and the product of PH and ENV (PHENV). The speech intelligibility prediction accuracies were 97.33% (ERP), 99.42% (ENV), 99.55% (PH), and 99.91% (PHENV). The models were interpreted using the occlusion sensitivity approach: while the informative electrodes of the ENV model were located over the occipital area, the occlusion sensitivity maps of the phoneme models (PH and PHENV) placed the informative electrodes over language-processing areas. Of the models tested, the PHENV model achieved the best speech intelligibility prediction accuracy and may promote clinical prediction of speech intelligibility with a more comfortable testing procedure.
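The input features named in this abstract are correlations between the speech-evoked EEG and the stimulus regressors (ENV, PH, and their product PHENV). The sketch below illustrates computing such correlation features on simulated signals; the sampling rate, durations, and signal construction are assumptions, and the downstream CNN classifier is only indicated in a comment.

```python
import numpy as np

# Simulated stand-ins for the quantities named in the abstract: a
# speech-evoked EEG response, the speech temporal envelope (ENV), and a
# phoneme-onset regressor (PH), all at a common sampling rate.
rng = np.random.default_rng(6)
fs, dur = 64, 60                      # 64 Hz, 60 s (assumed values)
n = fs * dur
env = np.abs(rng.normal(size=n))              # ENV: speech temporal envelope
ph = (rng.random(n) < 0.05).astype(float)     # PH: sparse phoneme-onset train
eeg = 0.6 * env + 0.4 * ph + rng.normal(0, 1, n)   # simulated EEG response

# Correlation-based features of the kind described: EEG vs. ENV, EEG vs. PH,
# and EEG vs. the elementwise product PH * ENV (PHENV).
features = {
    "ENV": np.corrcoef(eeg, env)[0, 1],
    "PH": np.corrcoef(eeg, ph)[0, 1],
    "PHENV": np.corrcoef(eeg, ph * env)[0, 1],
}
print({k: round(v, 2) for k, v in features.items()})
# Per-channel features like these would then feed a CNN classifier mapping
# EEG-derived inputs to one of the discrete intelligibility levels.
```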
Affiliation(s)
- Youngmin Na: Department of Biomedical Engineering, University of Ulsan, Ulsan, South Korea
- Hyosung Joo: Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea
- Le Thi Trang: Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea
- Luong Do Anh Quan: Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea
- Jihwan Woo: Department of Biomedical Engineering, University of Ulsan, Ulsan, South Korea; Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea (Correspondence: Jihwan Woo)