1. Burleson AM, Souza PE. Cognitive and linguistic abilities and perceptual restoration of missing speech: Evidence from online assessment. Front Psychol 2022; 13:1059192. [PMID: 36571056; PMCID: PMC9773209; DOI: 10.3389/fpsyg.2022.1059192]
Abstract
When speech is clear, speech understanding is a relatively simple and automatic process. However, when the acoustic signal is degraded, top-down cognitive and linguistic abilities, such as working memory capacity, lexical knowledge (i.e., vocabulary), inhibitory control, and processing speed, can often support speech understanding. This study examined whether listeners aged 22-63 (mean age 42 years) with better cognitive and linguistic abilities would be better able to perceptually restore missing speech information than those with poorer scores. Additionally, the roles of context and everyday speech were investigated using high-context, low-context, and realistic speech corpora. Sixty-three adult participants with self-reported normal hearing completed a short cognitive and linguistic battery before listening to sentences interrupted by silent gaps or noise bursts. Results indicated that working memory was the most reliable predictor of perceptual restoration ability, followed by lexical knowledge, inhibitory control, and processing speed. Generally, silent gap conditions were related to and predicted by a broader range of cognitive abilities, whereas noise burst conditions were related to working memory capacity and inhibitory control. These findings suggest that higher-order cognitive and linguistic abilities facilitate the top-down restoration of missing speech information and contribute to individual variability in perceptual restoration.
2. Moberly AC, Lewis JH, Vasil KJ, Ray C, Tamati TN. Bottom-Up Signal Quality Impacts the Role of Top-Down Cognitive-Linguistic Processing During Speech Recognition by Adults with Cochlear Implants. Otol Neurotol 2021; 42:S33-S41. [PMID: 34766942; PMCID: PMC8597903; DOI: 10.1097/mao.0000000000003377]
Abstract
HYPOTHESES Significant variability persists in speech recognition outcomes in adults with cochlear implants (CIs). Sensory ("bottom-up") and cognitive-linguistic ("top-down") processes help explain this variability. However, the interactions of these bottom-up and top-down factors remain unclear. One hypothesis was tested: top-down processes would contribute differentially to speech recognition, depending on the fidelity of bottom-up input. BACKGROUND Bottom-up spectro-temporal processing, assessed using a Spectral-Temporally Modulated Ripple Test (SMRT), is associated with CI speech recognition outcomes. Similarly, top-down cognitive-linguistic skills relate to outcomes, including working memory capacity, inhibition-concentration, speed of lexical access, and nonverbal reasoning. METHODS Fifty-one adult CI users were tested for word and sentence recognition, along with performance on the SMRT and a battery of cognitive-linguistic tests. The group was divided into "low-," "intermediate-," and "high-SMRT" groups, based on SMRT scores. Separate correlation analyses were performed for each subgroup between a composite score of cognitive-linguistic processing and speech recognition. RESULTS Associations of top-down composite scores with speech recognition were not significant for the low-SMRT group. In contrast, these associations were significant and of medium effect size (Spearman's rho = 0.44-0.46) for two sentence types for the intermediate-SMRT group. For the high-SMRT group, top-down scores were associated with both word and sentence recognition, with medium to large effect sizes (Spearman's rho = 0.45-0.58). CONCLUSIONS Top-down processes contribute differentially to speech recognition in CI users based on the quality of bottom-up input. Findings have clinical implications for individualized treatment approaches relying on bottom-up device programming or top-down rehabilitation approaches.
Affiliation(s)
- Aaron C Moberly
- Department of Otolaryngology - Head & Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio, USA
- Jessica H Lewis
- Department of Otolaryngology - Head & Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio, USA
- Kara J Vasil
- Department of Otolaryngology - Head & Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio, USA
- Christin Ray
- Department of Otolaryngology - Head & Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio, USA
- Terrin N Tamati
- Department of Otolaryngology - Head & Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio, USA
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
3. Tamati TN, Ray C, Vasil KJ, Pisoni DB, Moberly AC. High- and Low-Performing Adult Cochlear Implant Users on High-Variability Sentence Recognition: Differences in Auditory Spectral Resolution and Neurocognitive Functioning. J Am Acad Audiol 2020; 31:324-335. [PMID: 31580802; DOI: 10.3766/jaaa.18106]
Abstract
BACKGROUND Postlingually deafened adult cochlear implant (CI) users routinely display large individual differences in the ability to recognize and understand speech, especially in adverse listening conditions. Although individual differences have been linked to several sensory ("bottom-up") and cognitive ("top-down") factors, little is currently known about the relative contributions of these factors in high- and low-performing CI users. PURPOSE The aim of the study was to investigate differences in sensory functioning and neurocognitive functioning between high- and low-performing CI users on the Perceptually Robust English Sentence Test Open-set (PRESTO), a high-variability sentence recognition test containing sentence materials produced by multiple male and female talkers with diverse regional accents. RESEARCH DESIGN CI users with accuracy scores in the upper (HiPRESTO) or lower quartiles (LoPRESTO) on PRESTO in quiet completed a battery of behavioral tasks designed to assess spectral resolution and neurocognitive functioning. STUDY SAMPLE Twenty-one postlingually deafened adult CI users participated: 11 HiPRESTO and 10 LoPRESTO. DATA COLLECTION AND ANALYSIS A discriminant analysis was carried out to determine the extent to which measures of spectral resolution and neurocognitive functioning discriminate HiPRESTO and LoPRESTO CI users. Auditory spectral resolution was measured using the Spectral-Temporally Modulated Ripple Test (SMRT). Neurocognitive functioning was assessed with visual measures of working memory (digit span), inhibitory control (Stroop), speed of lexical/phonological access (Test of Word Reading Efficiency), and nonverbal reasoning (Raven's Progressive Matrices). RESULTS HiPRESTO and LoPRESTO CI users were discriminated primarily by performance on the SMRT and secondarily by the Raven's test. No other neurocognitive measures contributed substantially to the discriminant function.
CONCLUSIONS High- and low-performing CI users differed by spectral resolution and, to a lesser extent, nonverbal reasoning. These findings suggest that the extreme groups are determined by global factors of richness of sensory information and domain-general, nonverbal intelligence, rather than specific neurocognitive processing operations related to speech perception and spoken word recognition. Thus, although both bottom-up and top-down information contribute to speech recognition performance, low-performing CI users may not be sufficiently able to rely on neurocognitive skills specific to speech recognition to enhance processing of spectrally degraded input in adverse conditions involving high talker variability.
Affiliation(s)
- Terrin N Tamati
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Department of Otolaryngology - Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH
- Christin Ray
- Department of Otolaryngology - Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH
- Kara J Vasil
- Department of Otolaryngology - Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH
- David B Pisoni
- Department of Psychological and Brain Sciences, Indiana University - Bloomington, Bloomington, IN
- Aaron C Moberly
- Department of Otolaryngology - Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH
4. Pals C, Sarampalis A, Beynon A, Stainsby T, Başkent D. Effect of Spectral Channels on Speech Recognition, Comprehension, and Listening Effort in Cochlear-Implant Users. Trends Hear 2020; 24:2331216520904617. [PMID: 32189585; PMCID: PMC7082863; DOI: 10.1177/2331216520904617]
Abstract
In favorable listening conditions, cochlear-implant (CI) users can reach high speech recognition scores with as little as seven active electrodes. Here, we hypothesized that even when speech recognition is high, additional spectral channels may still benefit other aspects of speech perception, such as comprehension and listening effort. Twenty-five adult, postlingually deafened CI users, selected from two Dutch implant centers for high clinical word identification scores, participated in two experiments. Experimental conditions were created by varying the number of active electrodes of the CIs between 7 and 15. In Experiment 1, response times (RTs) on the secondary task in a dual-task paradigm were used as an indirect measure of listening effort, and in Experiment 2, sentence verification task (SVT) accuracy and RTs were used to measure speech comprehension and listening effort, respectively. Speech recognition was near ceiling for all conditions tested, as intended by the design. However, the dual-task paradigm failed to show the hypothesized decrease in RTs with increasing spectral channels. The SVT did show a systematic improvement in both speech comprehension and response speed across all conditions. In conclusion, the SVT revealed additional benefits in both speech comprehension and listening effort for conditions in which high speech recognition was already achieved. Hence, adding spectral channels may provide benefits for CI listeners that may not be reflected by traditional speech tests. The SVT is a relatively simple task that is easy to implement and may therefore be a good candidate for identifying such additional benefits in research or clinical settings.
Affiliation(s)
- Carina Pals
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, the Netherlands
- Andy Beynon
- Department of Otorhinolaryngology, Head and Neck Surgery, Hearing and Implants, Radboud University Medical Centre, Nijmegen, the Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, the Netherlands
5. Patro C, Mendel LL. Semantic influences on the perception of degraded speech by individuals with cochlear implants. J Acoust Soc Am 2020; 147:1778. [PMID: 32237796; DOI: 10.1121/10.0000934]
Abstract
This study investigated whether speech intelligibility in cochlear implant (CI) users is affected by semantic context. Three groups participated in two experiments: two groups of listeners with normal hearing (NH) listened to either full-spectrum speech or vocoded speech, and one CI group listened to full-spectrum speech. Experiment 1 measured participants' sentence recognition as a function of target-to-masker ratio (four-talker babble masker), and Experiment 2 measured perception of interrupted speech as a function of duty cycle (long/short uninterrupted speech). Listeners were presented with both semantically congruent and incongruent targets. Results from the two experiments suggested that NH listeners benefitted more from semantic cues as the listening conditions became more challenging (lower signal-to-noise ratios and interrupted speech with longer silent intervals). The CI group, however, received minimal benefit from context, and therefore performed poorly in such conditions. Conversely, in the less challenging conditions, CI users benefitted greatly from semantic context, whereas NH listeners did not rely on such cues. The results also confirmed that this differential use of semantic cues appears to originate from the spectro-temporal degradations experienced by CI users, which could be a contributing factor in their poor performance in suboptimal environments.
Affiliation(s)
- Chhayakanta Patro
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55414, USA
- Lisa Lucks Mendel
- School of Communication Sciences and Disorders, University of Memphis, Memphis, Tennessee 38152, USA
6. Calandruccio L, Wasiuk PA, Buss E, Leibold LJ, Kong J, Holmes A, Oleson J. The effect of target/masker fundamental frequency contour similarity on masked-speech recognition. J Acoust Soc Am 2019; 146:1065. [PMID: 31472562; PMCID: PMC6690832; DOI: 10.1121/1.5121314]
Abstract
Greater informational masking is observed when the target and masker speech are more perceptually similar. Fundamental frequency (f0) contour, or the dynamic movement of f0, is thought to provide cues for segregating target speech presented in a speech masker. Most of the data demonstrating this effect have been collected using digitally modified stimuli; less work has explored the role of f0 contour in speech-in-speech recognition when all of the stimuli have been produced naturally. The goal of this project was to explore the importance of target and masker f0 contour similarity by manipulating the speaking style of talkers producing the target and masker speech streams. Sentence recognition thresholds were evaluated for target and masker speech produced with flat, normal, or exaggerated speaking styles; performance was also measured in speech-spectrum-shaped noise and in conditions in which the stimuli were processed through an ideal binary mask. Results confirmed that similarities in f0 contour depth elevated speech-in-speech recognition thresholds; however, when the target and masker had similar contour depths, targets with normal f0 contours were more resistant to masking than targets with flat or exaggerated contours. Differences in energetic masking across stimuli cannot account for these results.
Affiliation(s)
- Lauren Calandruccio
- Department of Psychological Sciences, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Peter A Wasiuk
- Department of Psychological Sciences, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina, Chapel Hill, North Carolina 27599, USA
- Lori J Leibold
- Boys Town National Research Hospital, Omaha, Nebraska 68131, USA
- Jessica Kong
- Department of Psychological Sciences, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Ann Holmes
- Department of Psychological Sciences, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Jacob Oleson
- Department of Biostatistics, University of Iowa, Iowa City, Iowa 52246, USA
7. Patro C, Mendel LL. Gated Word Recognition by Postlingually Deafened Adults With Cochlear Implants: Influence of Semantic Context. J Speech Lang Hear Res 2018; 61:145-158. [PMID: 29242894; DOI: 10.1044/2017_jslhr-h-17-0141]
Abstract
PURPOSE The main goal of this study was to investigate the minimum amount of sensory information required to recognize spoken words (isolation points [IPs]) in listeners with cochlear implants (CIs) and to investigate facilitative effects of semantic context on the IPs. METHOD Listeners with CIs as well as those with normal hearing (NH) participated in the study. In Experiment 1, the CI users listened to unprocessed (full-spectrum) stimuli and individuals with NH listened to full-spectrum or vocoder-processed speech. IPs were determined for both groups, who listened to gated consonant-nucleus-consonant words that were selected based on lexical properties. In Experiment 2, the role of semantic context on IPs was evaluated. Target stimuli were chosen from the Revised Speech Perception in Noise corpus based on the lexical properties of the final words. RESULTS The results indicated that spectrotemporal degradations adversely impacted IPs for gated words, and CI users as well as participants with NH listening to vocoded speech had longer IPs than participants with NH who listened to full-spectrum speech. In addition, there was a clear disadvantage due to lack of semantic context in all groups regardless of the spectral composition of the target speech (full spectrum or vocoded). Finally, we showed that CI users (and NH listeners presented with vocoded speech) can overcome such word processing difficulties with the help of semantic context and perform as well as listeners with NH. CONCLUSION Word recognition occurs even before the entire word is heard because listeners with NH associate an acoustic input with its mental representation to understand speech. The results of this study provide insight into the role of spectral degradation in the processing of spoken words in isolation and the potential benefits of semantic context. These results may also explain why CI users rely substantially on semantic context.
Affiliation(s)
- Lisa Lucks Mendel
- School of Communication Sciences & Disorders, University of Memphis, TN
8. Top-Down Processes in Simulated Electric-Acoustic Hearing: The Effect of Linguistic Context on Bimodal Benefit for Temporally Interrupted Speech. Ear Hear 2018; 37:582-92. [PMID: 27007220; DOI: 10.1097/aud.0000000000000298]
Abstract
OBJECTIVES Previous studies have documented the benefits of bimodal hearing as compared with a cochlear implant alone, but most have focused on the importance of bottom-up, low-frequency cues. The purpose of the present study was to evaluate the role of top-down processing in bimodal hearing by measuring the effect of sentence context on bimodal benefit for temporally interrupted sentences. It was hypothesized that low-frequency acoustic cues would facilitate the use of contextual information in the interrupted sentences, resulting in greater bimodal benefit for the higher-context (CUNY) sentences than for the lower-context (IEEE) sentences. DESIGN Young normal-hearing listeners were tested in simulated bimodal listening conditions in which noise-band vocoded sentences were presented to one ear with or without low-pass (LP) filtered speech or LP harmonic complexes (LPHCs) presented to the contralateral ear. Speech recognition scores were measured in three listening conditions: vocoder alone, vocoder combined with LP speech, and vocoder combined with LPHCs. Temporally interrupted versions of the CUNY and IEEE sentences were used to assess listeners' ability to fill in missing segments of speech by using top-down linguistic processing. Sentences were square-wave gated at a rate of 5 Hz with a 50% duty cycle. Three vocoder channel conditions were tested for each type of sentence (8, 12, and 16 channels for CUNY; 12, 16, and 32 channels for IEEE), and bimodal benefit was compared for similar amounts of spectral degradation (matched-channel comparisons) and similar ranges of baseline performance. Two gain measures, percentage-point gain and normalized gain, were examined. RESULTS Significant effects of context on bimodal benefit were observed when LP speech was presented to the residual-hearing ear. For the matched-channel comparisons, CUNY sentences showed significantly higher normalized gains than IEEE sentences for both the 12-channel (20 points higher) and 16-channel (18 points higher) conditions. For the individual gain comparisons that used a similar range of baseline performance, CUNY sentences showed bimodal benefits that were significantly higher (7 percentage points, or 15 points normalized gain) than those for IEEE sentences. The bimodal benefits observed here for temporally interrupted speech were considerably smaller than those observed in an earlier study that used continuous speech. Furthermore, unlike previous findings for continuous speech, no bimodal benefit was observed when LPHCs were presented to the LP ear. CONCLUSIONS Findings indicate that linguistic context has a significant influence on bimodal benefit for temporally interrupted speech and support the hypothesis that low-frequency acoustic information presented to the residual-hearing ear facilitates the use of top-down linguistic processing in bimodal hearing. However, bimodal benefit is reduced for temporally interrupted speech as compared with continuous speech, suggesting that listeners' ability to restore missing speech information depends not only on top-down linguistic knowledge but also on the quality of the bottom-up sensory input.
9. Nagaraj NK, Magimairaj BM. Role of working memory and lexical knowledge in perceptual restoration of interrupted speech. J Acoust Soc Am 2017; 142:3756. [PMID: 29289104; DOI: 10.1121/1.5018429]
Abstract
The role of working memory (WM) capacity and lexical knowledge in perceptual restoration (PR) of missing speech was investigated using the interrupted speech perception paradigm. Speech identification ability, which indexed PR, was measured using low-context sentences periodically interrupted at 1.5 Hz. PR was measured for silent gated, low-frequency speech noise filled, and low-frequency fine-structure and envelope filled interrupted conditions. WM capacity was measured using verbal and visuospatial span tasks. Lexical knowledge was assessed using both receptive vocabulary and meaning-from-context tests. Results showed that PR was better for the speech noise filled condition than for the other conditions tested. Both receptive vocabulary and verbal WM capacity explained unique variance in PR for the speech noise filled condition, but were unrelated to performance in the silent gated condition. Only receptive vocabulary uniquely predicted PR for the fine-structure and envelope filled conditions. These findings suggest that the contribution of lexical knowledge and verbal WM during PR depends crucially on the information content that replaced the silent intervals. When perceptual continuity was partially restored by filler speech noise, both lexical knowledge and verbal WM capacity facilitated PR. Importantly, for the fine-structure and envelope filled interrupted conditions, lexical knowledge was crucial for PR.
Affiliation(s)
- Naveen K Nagaraj
- Cognitive Hearing Science Lab, University of Arkansas for Medical Sciences and University of Arkansas at Little Rock, Little Rock, Arkansas 72204, USA
- Beula M Magimairaj
- Cognition and Language Lab, Communication Sciences and Disorders, University of Central Arkansas, Conway, Arkansas 72035, USA
10. Clarke J, Kazanoğlu D, Başkent D, Gaudrain E. Effect of F0 contours on top-down repair of interrupted speech. J Acoust Soc Am 2017; 142:EL7. [PMID: 28764445; DOI: 10.1121/1.4990398]
Abstract
Top-down repair of interrupted speech can be influenced by bottom-up acoustic cues such as voice pitch (F0). This study aimed to investigate the role of the dynamic information of pitch, i.e., F0 contours, in top-down repair of speech. Intelligibility of sentences interrupted with silence or noise was measured in five F0 contour conditions (inverted, flat, original, and exaggerated by a factor of 1.5 or 1.75). The main hypothesis was that manipulating F0 contours would impair the linking of successive segments of interrupted speech and thus negatively affect top-down repair. Intelligibility of interrupted speech was impaired only by misleading dynamic information (inverted F0 contours); the top-down repair of interrupted speech was not affected by any F0 contour manipulation.
Affiliation(s)
- Jeanne Clarke
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, P.O. Box 30.001, BB21, 9700 RB Groningen, The Netherlands
- Deniz Kazanoğlu
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, P.O. Box 30.001, BB21, 9700 RB Groningen, The Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, P.O. Box 30.001, BB21, 9700 RB Groningen, The Netherlands
- Etienne Gaudrain
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, P.O. Box 30.001, BB21, 9700 RB Groningen, The Netherlands
11. Stilp C, Donaldson G, Oh S, Kong YY. Influences of noise-interruption and information-bearing acoustic changes on understanding simulated electric-acoustic speech. J Acoust Soc Am 2016; 140:3971. [PMID: 27908030; PMCID: PMC6909990; DOI: 10.1121/1.4967445]
Abstract
In simulations of electrical-acoustic stimulation (EAS), vocoded speech intelligibility is aided by preservation of low-frequency acoustic cues. However, the speech signal is often interrupted in everyday listening conditions, and effects of interruption on hybrid speech intelligibility are poorly understood. Additionally, listeners rely on information-bearing acoustic changes to understand full-spectrum speech (as measured by cochlea-scaled entropy [CSE]) and vocoded speech (CSECI), but how listeners utilize these informational changes to understand EAS speech is unclear. Here, normal-hearing participants heard noise-vocoded sentences with three to six spectral channels in two conditions: vocoder-only (80-8000 Hz) and simulated hybrid EAS (vocoded above 500 Hz; original acoustic signal below 500 Hz). In each sentence, four 80-ms intervals containing high-CSECI or low-CSECI acoustic changes were replaced with speech-shaped noise. As expected, performance improved with the preservation of low-frequency fine-structure cues (EAS). This improvement decreased for continuous EAS sentences as more spectral channels were added, but increased as more channels were added to noise-interrupted EAS sentences. Performance was impaired more when high-CSECI intervals were replaced by noise than when low-CSECI intervals were replaced, but this pattern did not differ across listening modes. Utilizing information-bearing acoustic changes to understand speech is predicted to generalize to cochlear implant users who receive EAS inputs.
Affiliation(s)
- Christian Stilp
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
- Gail Donaldson
- Department of Communication Sciences and Disorders, University of South Florida, PCD 1017, 4202 East Fowler Avenue, Tampa, Florida 33620, USA
- Soohee Oh
- Department of Communication Sciences and Disorders, University of South Florida, PCD 1017, 4202 East Fowler Avenue, Tampa, Florida 33620, USA
- Ying-Yee Kong
- Department of Communication Sciences and Disorders, Northeastern University, 226 Forsyth Building, 360 Huntington Avenue, Boston, Massachusetts 02115, USA
12. Başkent D, Clarke J, Pals C, Benard MR, Bhargava P, Saija J, Sarampalis A, Wagner A, Gaudrain E. Cognitive Compensation of Speech Perception With Hearing Impairment, Cochlear Implants, and Aging. Trends Hear 2016. [PMCID: PMC5056620; DOI: 10.1177/2331216516670279]
Abstract
External degradations in incoming speech reduce understanding, and hearing impairment further compounds the problem. While cognitive mechanisms alleviate some of the difficulties, their effectiveness may change with age. In our research, reviewed here, we investigated cognitive compensation with hearing impairment, cochlear implants, and aging, via (a) phonemic restoration as a measure of top-down filling of missing speech, (b) listening effort and response times as a measure of increased cognitive processing, and (c) visual world paradigm and eye gazing as a measure of the use of context and its time course. Our results indicate that between speech degradations and their cognitive compensation, there is a fine balance that seems to vary greatly across individuals. Hearing impairment or inadequate hearing device settings may limit compensation benefits. Cochlear implants seem to allow the effective use of sentential context, but likely at the cost of delayed processing. Linguistic and lexical knowledge, which play an important role in compensation, may be successfully employed in advanced age, as some compensatory mechanisms seem to be preserved. These findings indicate that cognitive compensation in hearing impairment can be highly complicated—not always absent, but also not easily predicted by speech intelligibility tests only.
Affiliation(s)
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Netherlands
- Graduate School of Medical Sciences, University of Groningen, Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands
- Jeanne Clarke
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Netherlands
- Graduate School of Medical Sciences, University of Groningen, Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands
- Carina Pals
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Netherlands
- Graduate School of Medical Sciences, University of Groningen, Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands
- Michel R. Benard
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Netherlands
- Pento Speech and Hearing Center Zwolle, Zwolle, Netherlands
- Pranesh Bhargava
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Netherlands
- Graduate School of Medical Sciences, University of Groningen, Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands
- Jefta Saija
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Netherlands
- Graduate School of Medical Sciences, University of Groningen, Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands
- Anastasios Sarampalis
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands
- Department of Psychology, University of Groningen, Netherlands
- Anita Wagner
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Netherlands
- Graduate School of Medical Sciences, University of Groningen, Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands
- Etienne Gaudrain
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Netherlands
- Graduate School of Medical Sciences, University of Groningen, Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands
- Auditory Cognition and Psychoacoustics, CNRS, Lyon Neuroscience Research Center, Lyon, France
13
Patro C, Mendel LL. Role of contextual cues on the perception of spectrally reduced interrupted speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2016; 140:1336. [PMID: 27586760 DOI: 10.1121/1.4961450] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Understanding speech within an auditory scene is constantly challenged by interfering noise in suboptimal listening environments when noise hinders the continuity of the speech stream. In such instances, a typical auditory-cognitive system perceptually integrates available speech information and "fills in" missing information in the light of semantic context. However, individuals with cochlear implants (CIs) find it difficult and effortful to understand interrupted speech compared to their normal hearing counterparts. This inefficiency in perceptual integration of speech could be attributed to further degradations in the spectral-temporal domain imposed by CIs, making it difficult to utilize contextual evidence effectively. To address these issues, 20 normal hearing adults listened to speech that was either spectrally reduced or both spectrally reduced and interrupted, in a manner similar to CI processing. The Revised Speech Perception in Noise test, which includes contextually rich and contextually poor sentences, was used to evaluate the influence of semantic context on speech perception. Results indicated that listeners benefited more from semantic context when they listened to spectrally reduced speech alone. For the spectrally reduced interrupted speech, contextual information was not as helpful under significant spectral reductions, but became beneficial as the spectral resolution improved. These results suggest that top-down processing facilitates speech perception up to a point but fails to facilitate speech understanding when the speech signals are significantly degraded.
Affiliation(s)
- Chhayakanta Patro
- School of Communication Sciences and Disorders, University of Memphis, 4055 North Park Loop, Memphis, Tennessee, 38152, USA
- Lisa Lucks Mendel
- School of Communication Sciences and Disorders, University of Memphis, 4055 North Park Loop, Memphis, Tennessee, 38152, USA
14
The Intelligibility of Interrupted Speech: Cochlear Implant Users and Normal Hearing Listeners. J Assoc Res Otolaryngol 2016; 17:475-91. [PMID: 27090115 PMCID: PMC5023536 DOI: 10.1007/s10162-016-0565-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2015] [Accepted: 03/18/2016] [Indexed: 11/13/2022] Open
Abstract
Compared with normal-hearing listeners, cochlear implant (CI) users display a loss of intelligibility of speech interrupted by silence or noise, possibly due to a reduced ability to integrate and restore speech glimpses across silence or noise intervals. The present study was conducted to establish the extent of the deficit typical CI users have in understanding interrupted high-context sentences as a function of a range of interruption rates (1.5 to 24 Hz) and duty cycles (50 and 75%). Further, factors such as reduced signal quality of CI signal transmission and advanced age, as well as potentially lower speech intelligibility of CI users even in the absence of any interruption manipulation, were explored by presenting young, as well as age-matched, normal-hearing (NH) listeners with full-spectrum and vocoded speech (eight-channel and speech-intelligibility baseline-performance matched). While the actual CI users had more difficulties in understanding interrupted speech and taking advantage of faster interruption rates and increased duty cycle than the eight-channel noise-band vocoded listeners, their performance was similar to the matched noise-band vocoded listeners. These results suggest that while loss of spectro-temporal resolution indeed plays an important role in reduced intelligibility of interrupted speech, these factors alone cannot entirely explain the deficit. Other factors associated with real CIs, such as aging or failure in transmission of essential speech cues, seem to additionally contribute to poor intelligibility of interrupted speech.
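The interruption manipulation common to this and several neighboring entries (a periodic gate defined by an interruption rate and a duty cycle, with the silent gaps optionally filled by noise) can be sketched in a few lines. This is an illustrative reconstruction, not any study's actual code; the function name, the flat white-noise filler, and the RMS matching are assumptions:

```python
import numpy as np

def interrupt_speech(signal, fs, rate_hz=1.5, duty_cycle=0.5, filler_noise=False):
    """Gate a waveform with a periodic square wave: each cycle keeps the
    signal for `duty_cycle` of the period and silences (or noise-fills)
    the remainder."""
    period = int(fs / rate_hz)         # samples per on/off cycle
    on_len = int(period * duty_cycle)  # samples kept per cycle
    gate = np.zeros(len(signal))
    for start in range(0, len(signal), period):
        gate[start:start + on_len] = 1.0
    out = signal * gate
    if filler_noise:
        # Fill the gaps with white noise matched to the signal's RMS level
        # (a crude stand-in for the speech-shaped fillers used in this literature).
        rms = np.sqrt(np.mean(signal ** 2))
        noise = np.random.randn(len(signal)) * rms
        out = out + noise * (1.0 - gate)
    return out
```

At 1.5 Hz and a 50% duty cycle, for example, roughly 333-ms speech glimpses alternate with 333-ms gaps; raising the duty cycle to 75% lengthens the glimpses at the expense of the gaps.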
15
The effect of visual cues on top-down restoration of temporally interrupted speech, with and without further degradations. Hear Res 2015; 328:24-33. [PMID: 26117407 DOI: 10.1016/j.heares.2015.06.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/20/2015] [Revised: 06/15/2015] [Accepted: 06/22/2015] [Indexed: 11/21/2022]
Abstract
In complex listening situations, cognitive restoration mechanisms are commonly used to enhance perception of degraded speech with inaudible segments. Profoundly hearing-impaired people with a cochlear implant (CI) show less benefit from such mechanisms. However, both normal hearing (NH) listeners and CI users do benefit from visual speech cues in these listening situations. In this study we investigated whether an accompanying video of the speaker can enhance the intelligibility of interrupted sentences and the phonemic restoration benefit, measured by an increase in intelligibility when the silent intervals are filled with noise. Similar to previous studies, restoration benefit was observed with interrupted speech without spectral degradations (Experiment 1), but was absent in acoustic simulations of CIs (Experiment 2) and was present again in simulations of electric-acoustic stimulation (Experiment 3). In all experiments, the additional speech information provided by the complementary visual cues led to overall higher intelligibility; however, these cues did not influence the occurrence or extent of the phonemic restoration benefit of filler noise. Results imply that visual cues do not show a synergistic effect with the filler noise, as adding them equally increased the intelligibility of interrupted sentences with or without the filler noise.
16
Effects of the simultaneous application of nonlinear frequency compression and dichotic hearing on the speech recognition of severely hearing-impaired subjects: simulation test. Clin Exp Otorhinolaryngol 2015; 8:102-10. [PMID: 26045907 PMCID: PMC4451533 DOI: 10.3342/ceo.2015.8.2.102] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2014] [Revised: 02/20/2014] [Accepted: 03/29/2014] [Indexed: 11/08/2022] Open
Abstract
OBJECTIVES The clinical effects of the simultaneous application of nonlinear frequency compression and dichotic hearing on people with hearing impairments have not been evaluated previously. In this study, the clinical effects of the simultaneous application of these two techniques on the recognition of consonant-vowel-consonant (CVC) words with fricatives were evaluated using normal-hearing subjects and a hearing loss simulator operated in the severe hearing loss setting. METHODS A total of 21 normal-hearing volunteers whose native language was English were recruited for this study, and two different hearing loss simulators, which were configured for severe hearing loss in the high-frequency range, were utilized. The subjects heard 82 English CVC words, and the word recognition score and response time were measured. RESULTS The experimental results demonstrated that the simultaneous application of these two techniques showed almost equivalent performance compared to the sole application of nonlinear frequency compression in a severe hearing loss setting. CONCLUSION Though it is generally accepted that dichotic hearing can decrease the spectral masking thresholds of a hearing-impaired person, simultaneous application of the nonlinear frequency compression and dichotic hearing techniques did not significantly improve the recognition of words with fricatives compared to the sole application of nonlinear frequency compression in a severe hearing loss setting.
17
Perry TT, Kwon BJ. Amplitude fluctuations in a masker influence lexical segmentation in cochlear implant users. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 137:2070-2079. [PMID: 25920857 PMCID: PMC4417024 DOI: 10.1121/1.4916698] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/25/2014] [Revised: 03/12/2015] [Accepted: 03/18/2015] [Indexed: 06/01/2023]
Abstract
Normal-hearing listeners show masking release, or better speech understanding in a fluctuating-amplitude masker than in a steady-amplitude masker, but most cochlear implant (CI) users consistently show little or no masking release even in artificial conditions where masking release is highly anticipated. The current study examined the hypothesis that the reduced or absent masking release in CI users is due to disruption of linguistic segmentation cues. Eleven CI subjects completed a sentence keyword identification task in a steady masker and a fluctuating masker with dips timed to increase speech availability. Lexical boundary errors in their responses were categorized as consistent or inconsistent with the use of the metrical segmentation strategy (MSS). Subjects who demonstrated masking release showed greater adherence to the MSS in the fluctuating masker compared to subjects who showed little or no masking release, while both groups used metrical segmentation cues similarly in the steady masker. Based on the characteristics of the segmentation cues, the results are interpreted as evidence that CI listeners showing little or no masking release are not reliably segregating speech from competing sounds, further suggesting that one challenge faced by CI users listening in noisy environments is a reduction of reliable segmentation cues.
Affiliation(s)
- Trevor T Perry
- Department of Hearing, Speech, and Language Sciences, Gallaudet University, Washington, DC 20002
- Bomjun J Kwon
- Department of Hearing, Speech, and Language Sciences, Gallaudet University, Washington, DC 20002
18
Stilp CE, Goupell MJ. Spectral and temporal resolutions of information-bearing acoustic changes for understanding vocoded sentences. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 137:844-55. [PMID: 25698018 PMCID: PMC4336249 DOI: 10.1121/1.4906179] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/23/2014] [Revised: 12/12/2014] [Accepted: 12/27/2014] [Indexed: 06/04/2023]
Abstract
Short-time spectral changes in the speech signal are important for understanding noise-vocoded sentences. These information-bearing acoustic changes, measured using cochlea-scaled entropy in cochlear implant simulations [CSECI; Stilp et al. (2013). J. Acoust. Soc. Am. 133(2), EL136-EL141; Stilp (2014). J. Acoust. Soc. Am. 135(3), 1518-1529], may offer better understanding of speech perception by cochlear implant (CI) users. However, perceptual importance of CSECI for normal-hearing listeners was tested at only one spectral resolution and one temporal resolution, limiting generalizability of results to CI users. Here, experiments investigated the importance of these informational changes for understanding noise-vocoded sentences at different spectral resolutions (4-24 spectral channels; Experiment 1), temporal resolutions (4-64 Hz cutoff for low-pass filters that extracted amplitude envelopes; Experiment 2), or when both parameters varied (6-12 channels, 8-32 Hz; Experiment 3). Sentence intelligibility was reduced more by replacing high-CSECI intervals with noise than replacing low-CSECI intervals, but only when sentences had sufficient spectral and/or temporal resolution. High-CSECI intervals were more important for speech understanding as spectral resolution worsened and temporal resolution improved. Trade-offs between CSECI and intermediate spectral and temporal resolutions were minimal. These results suggest that signal processing strategies that emphasize information-bearing acoustic changes in speech may improve speech perception for CI users.
Affiliation(s)
- Christian E Stilp
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292
- Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742
19
Ardoint M, Green T, Rosen S. The intelligibility of interrupted speech depends upon its uninterrupted intelligibility. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 136:EL275-EL280. [PMID: 25324110 DOI: 10.1121/1.4895096] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
Recognition of sentences containing periodic, 5-Hz, silent interruptions of differing duty cycles was assessed for three types of processed speech. Processing conditions employed different combinations of spectral resolution and the availability of fundamental frequency (F0) information, chosen to yield similar, below-ceiling performance for uninterrupted speech. Performance declined with decreasing duty cycle similarly for each processing condition, suggesting that, at least for certain forms of speech processing and interruption rates, performance with interrupted speech may reflect that obtained with uninterrupted speech. This highlights the difficulty in interpreting differences in interrupted speech performance across conditions for which uninterrupted performance is at ceiling.
Affiliation(s)
- Marine Ardoint
- Speech Hearing and Phonetic Sciences, University College London, Chandler House, 2 Wakefield Street, London WC1N 1PF, United Kingdom
- Tim Green
- Speech Hearing and Phonetic Sciences, University College London, Chandler House, 2 Wakefield Street, London WC1N 1PF, United Kingdom
- Stuart Rosen
- Speech Hearing and Phonetic Sciences, University College London, Chandler House, 2 Wakefield Street, London WC1N 1PF, United Kingdom
20
Benard MR, Başkent D. Perceptual learning of temporally interrupted spectrally degraded speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 136:1344. [PMID: 25190407 DOI: 10.1121/1.4892756] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
Normal-hearing (NH) listeners make use of context, speech redundancy and top-down linguistic processes to perceptually restore inaudible or masked portions of speech. Previous research has shown poorer perception and restoration of interrupted speech in CI users and NH listeners tested with acoustic simulations of CIs. Three hypotheses were investigated: (1) training with CI simulations of interrupted sentences can teach listeners to use the high-level restoration mechanisms more effectively, (2) phonemic restoration benefit, an increase in intelligibility of interrupted sentences once their silent gaps are filled with noise, can be induced with training, and (3) perceptual learning of interrupted sentences can be reflected in clinical speech audiometry. To test these hypotheses, NH listeners were trained using periodically interrupted sentences, also spectrally degraded with a noiseband vocoder as CI simulation. Feedback was presented by displaying the sentence text and playing back both the intact and the interrupted CI simulation of the sentence. Training induced no phonemic restoration benefit, and learning was not transferred to speech audiometry measured with words. However, a significant improvement was observed in overall intelligibility of interrupted spectrally degraded sentences, with or without filler noise, suggesting possibly better use of restoration mechanisms as a result of training.
Affiliation(s)
- Michel Ruben Benard
- Pento Audiology Center Zwolle, Oosterlaan 20, 8011 GC Zwolle, The Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
21
Bhargava P, Gaudrain E, Başkent D. Top–down restoration of speech in cochlear-implant users. Hear Res 2014; 309:113-23. [DOI: 10.1016/j.heares.2013.12.003] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/11/2013] [Revised: 11/21/2013] [Accepted: 12/12/2013] [Indexed: 10/25/2022]
22
Pals C, Sarampalis A, Baskent D. Listening effort with cochlear implant simulations. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2013; 56:1075-1084. [PMID: 23275424 DOI: 10.1044/1092-4388(2012/12-0074)] [Citation(s) in RCA: 71] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
Abstract
PURPOSE Fitting a cochlear implant (CI) for optimal speech perception does not necessarily optimize listening effort. This study aimed to show that listening effort may change between CI processing conditions for which speech intelligibility remains constant. METHOD Nineteen normal-hearing participants listened to CI simulations with varying numbers of spectral channels. A dual-task paradigm combining an intelligibility task with either a linguistic or nonlinguistic visual response-time (RT) task measured intelligibility and listening effort. The simultaneously performed tasks compete for limited cognitive resources; changes in effort associated with the intelligibility task are reflected in changes in RT on the visual task. A separate self-report scale provided a subjective measure of listening effort. RESULTS All measures showed significant improvements with increasing spectral resolution up to 6 channels. However, only the RT measure of listening effort continued improving up to 8 channels. The effects were stronger for RTs recorded during listening than for RTs recorded between listening. CONCLUSION The results suggest that listening effort decreases with increased spectral resolution. Moreover, these improvements are best reflected in objective measures of listening effort, such as RTs on a secondary task, rather than intelligibility scores or subjective effort measures.
Affiliation(s)
- Carina Pals
- University Medical Center Groningen, the Netherlands.
23
Abstract
The intelligibility of periodically interrupted speech improves once the silent gaps are filled with noise bursts. This improvement has been attributed to phonemic restoration, a top-down repair mechanism that helps intelligibility of degraded speech in daily life. Two hypotheses were investigated using perceptual learning of interrupted speech. If different cognitive processes played a role in restoring interrupted speech with and without filler noise, the two forms of speech would be learned at different rates and with different perceived mental effort. If the restoration benefit were an artificial outcome of using the ecologically invalid stimulus of speech with silent gaps, this benefit would diminish with training. Two groups of normal-hearing listeners were trained, one with interrupted sentences with the filler noise, and the other without. Feedback was provided with the auditory playback of the unprocessed and processed sentences, as well as the visual display of the sentence text. Training increased the overall performance significantly; however, the restoration benefit did not diminish. The increase in intelligibility and the decrease in perceived mental effort were relatively similar between the groups, implying similar cognitive mechanisms for the restoration of the two types of interruptions. Training effects were generalizable, as both groups also improved with the form of speech they were not trained on, and the gains were retained over time. Due to the null results and the relatively small number of participants (10 per group), further research is needed to more confidently draw conclusions. Nevertheless, training with interrupted speech seems to be effective, stimulating participants to more actively and efficiently use the top-down restoration. This finding further implies the potential of this training approach as a rehabilitative tool for hearing-impaired/elderly populations.
24
Stilp CE, Goupell MJ, Kluender KR. Speech perception in simulated electric hearing exploits information-bearing acoustic change. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 133:EL136-EL141. [PMID: 23363194 PMCID: PMC3562329 DOI: 10.1121/1.4776773] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/26/2012] [Revised: 12/12/2012] [Indexed: 05/28/2023]
Abstract
Stilp and Kluender [(2010). Proc. Natl. Acad. Sci. U.S.A. 107(27), 12387-12392] reported measures of sensory change over time (cochlea-scaled spectral entropy, CSE) reliably predicted sentence intelligibility for normal-hearing listeners. Here, implications for listeners with atypical hearing were explored using noise-vocoded speech. CSE was parameterized as Euclidean distances between biologically scaled spectra [measured before sentences were noise vocoded (CSE)] or between channel amplitude profiles in simulated cochlear-implant processing [measured after vocoding (CSE(CI))]. Sentence intelligibility worsened with greater amounts of information replaced by noise; patterns of performance did not differ between CSE and CSE(CI). Results demonstrate the importance of information-bearing change for speech perception in simulated electric hearing.
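The cochlea-scaled entropy measure referenced here and in entry 18 (Euclidean distance between successive biologically scaled spectra or channel amplitude profiles) can be approximated in a few lines. This is a minimal sketch, using a crude FFT-band stand-in for biologically scaled filters; the function names, frame length, and band spacing are assumptions, not the published procedure:

```python
import numpy as np

def channel_envelopes(signal, fs, n_channels=8, frame_len=0.016):
    """Crude filterbank stand-in: per short frame, average the FFT
    magnitude spectrum within n_channels contiguous bands."""
    hop = int(frame_len * fs)
    frames = [signal[i:i + hop] for i in range(0, len(signal) - hop, hop)]
    profiles = []
    for fr in frames:
        mag = np.abs(np.fft.rfft(fr))
        bands = np.array_split(mag, n_channels)
        profiles.append([b.mean() for b in bands])
    return np.array(profiles)

def cse(profiles):
    """CSE proxy: Euclidean distance between successive channel
    amplitude profiles; larger values mark intervals of greater
    information-bearing acoustic change."""
    diffs = np.diff(profiles, axis=0)
    return np.sqrt((diffs ** 2).sum(axis=1))
```

Replacing the highest-`cse` frames with noise would then correspond to the high-CSECI replacement condition described above, and the lowest-`cse` frames to the low-CSECI condition.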
Affiliation(s)
- Christian E Stilp
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA.
25
Gaudrain E, Carlyon RP. Using Zebra-speech to study sequential and simultaneous speech segregation in a cochlear-implant simulation. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 133:502-518. [PMID: 23297922 PMCID: PMC3785145 DOI: 10.1121/1.4770243] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/28/2023]
Abstract
Previous studies have suggested that cochlear implant users may have particular difficulties exploiting opportunities to glimpse clear segments of a target speech signal in the presence of a fluctuating masker. Although it has been proposed that this difficulty is associated with a deficit in linking the glimpsed segments across time, the details of this mechanism are yet to be explained. The present study introduces a method called Zebra-speech developed to investigate the relative contribution of simultaneous and sequential segregation mechanisms in concurrent speech perception, using a noise-band vocoder to simulate cochlear implants. One experiment showed that the saliency of the difference between the target and the masker is a key factor for Zebra-speech perception, as it is for sequential segregation. Furthermore, forward masking played little or no role, confirming that intelligibility was not limited by energetic masking but by across-time linkage abilities. In another experiment, a binaural cue was used to distinguish the target and the masker. It showed that the relative contribution of simultaneous and sequential segregation depended on the spectral resolution, with listeners relying more on sequential segregation when the spectral resolution was reduced. The potential of Zebra-speech as a segregation enhancement strategy for cochlear implants is discussed.
Affiliation(s)
- Etienne Gaudrain
- MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, CB2 7EF Cambridge, United Kingdom.
26
Freyman RL, Griffin AM, Oxenham AJ. Intelligibility of whispered speech in stationary and modulated noise maskers. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2012; 132:2514-23. [PMID: 23039445 PMCID: PMC3477190 DOI: 10.1121/1.4747614] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2023]
Abstract
This study investigated the role of natural periodic temporal fine structure in helping listeners take advantage of temporal valleys in amplitude-modulated masking noise when listening to speech. Young normal-hearing participants listened to natural, whispered, and/or vocoded nonsense sentences in a variety of masking conditions. Whispering alters normal waveform temporal fine structure dramatically but, unlike vocoding, does not degrade spectral details created by vocal tract resonances. The improvement in intelligibility, or masking release, due to introducing 16-Hz square-wave amplitude modulations in an otherwise steady speech-spectrum noise was reduced substantially with vocoded sentences relative to natural speech, but was not reduced for whispered sentences. In contrast to natural speech, masking release for whispered sentences was observed even at positive signal-to-noise ratios. Whispered speech has a different short-term amplitude distribution relative to natural speech, and this appeared to explain the robust masking release for whispered speech at high signal-to-noise ratios. Recognition of whispered speech was not disproportionately affected by unpredictable modulations created by a speech-envelope modulated noise masker. Overall, the presence or absence of periodic temporal fine structure did not have a major influence on the degree of benefit obtained from imposing temporal fluctuations on a noise masker.
Affiliation(s)
- Richard L Freyman
- University of Massachusetts, Department of Communication Disorders, 358 North Pleasant Street, Amherst, Massachusetts 01003, USA.
27
Effect of speech degradation on top-down repair: phonemic restoration with simulations of cochlear implants and combined electric-acoustic stimulation. J Assoc Res Otolaryngol 2012; 13:683-92. [PMID: 22569838 PMCID: PMC3441953 DOI: 10.1007/s10162-012-0334-3] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2011] [Accepted: 04/24/2012] [Indexed: 11/11/2022] Open
Abstract
The brain, using expectations, linguistic knowledge, and context, can perceptually restore inaudible portions of speech. Such top-down repair is thought to enhance speech intelligibility in noisy environments. Hearing-impaired listeners with cochlear implants commonly complain about not understanding speech in noise. We hypothesized that the degradations in the bottom-up speech signals due to the implant signal processing may have a negative effect on the top-down repair mechanisms, which could partially be responsible for this complaint. To test the hypothesis, phonemic restoration of interrupted sentences was measured with young normal-hearing listeners using a noise-band vocoder simulation of implant processing. Decreasing the spectral resolution (by reducing the number of vocoder processing channels from 32 to 4) systematically degraded the speech stimuli. Supporting the hypothesis, the size of the restoration benefit varied as a function of spectral resolution. A significant benefit was observed only at the highest spectral resolution of 32 channels. With eight channels, which resembles the resolution available to most implant users, there was no significant restoration effect. Combined electric–acoustic hearing has been previously shown to provide better intelligibility of speech in adverse listening environments. In a second configuration, combined electric–acoustic hearing was simulated by adding low-pass-filtered acoustic speech to the vocoder processing. There was a slight improvement in phonemic restoration compared to the first configuration; the restoration benefit was observed at spectral resolutions of both 16 and 32 channels. However, the restoration was not observed at lower spectral resolutions (four or eight channels). Overall, the findings imply that the degradations in the bottom-up signals alone (such as occurs in cochlear implants) may reduce the top-down restoration of speech.
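The noise-band vocoder underlying the CI simulation above (band-splitting, temporal-envelope extraction, noise-carrier modulation, with spectral resolution set by the channel count) can be sketched roughly as follows. This is an assumed FFT-based approximation, not the study's processing chain; the smoothing window, linear channel spacing, and fixed noise seed are illustrative choices:

```python
import numpy as np

def noiseband_vocode(signal, fs, n_channels=8):
    """Crude noise-band vocoder: split the spectrum into n_channels
    bands, extract each band's temporal envelope (rectify + smooth),
    and use it to modulate band-limited white noise."""
    n = len(signal)
    spec = np.fft.rfft(signal)
    band_edges = np.linspace(0, len(spec), n_channels + 1).astype(int)
    smooth = int(0.008 * fs)               # ~8 ms moving-average smoother
    kernel = np.ones(smooth) / smooth
    rng = np.random.default_rng(0)         # fixed seed for reproducibility
    noise_spec = np.fft.rfft(rng.standard_normal(n))
    out = np.zeros(n)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        mask = np.zeros(len(spec))
        mask[lo:hi] = 1.0
        band = np.fft.irfft(spec * mask, n)         # band-limited speech
        env = np.convolve(np.abs(band), kernel, mode="same")
        carrier = np.fft.irfft(noise_spec * mask, n)  # band-limited noise
        out += env * carrier
    return out
```

Reducing `n_channels` from 32 toward 4 coarsens the spectral resolution in the same direction as the manipulation described above, which is what systematically degraded the restoration benefit.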
|
28
|
Bhargava P, Başkent D. Effects of low-pass filtering on intelligibility of periodically interrupted speech. J Acoust Soc Am 2012; 131:EL87-92. [PMID: 22352622 DOI: 10.1121/1.3670000] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/13/2023]
Abstract
The combined effect of low-pass filtering (cut-off frequencies between 500 and 3000 Hz) and periodic interruptions (1.5 and 10 Hz) on speech intelligibility was investigated. When the two manipulations were combined, intelligibility was lower than with either manipulation alone, even in some conditions where a single manipulation had no effect (such as the fast interruption rate of 10 Hz). By using young normal-hearing listeners, potential suprathreshold deficits and aging effects that may accompany hearing impairment were eliminated. Thus, the results imply that reduced audibility of high-frequency speech components may partially explain the reduced intelligibility of interrupted speech in hearing-impaired persons.
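The two manipulations combined in this study can be sketched as follows. The cutoff frequencies and interruption rates follow the abstract; the 4th-order Butterworth filter and the 50% on/off duty cycle are illustrative assumptions.

```python
# Sketch of the two stimulus manipulations: low-pass filtering and
# periodic interruption (speech alternating with silence).
import numpy as np
from scipy.signal import butter, sosfiltfilt

def low_pass(x, fs, cutoff_hz):
    sos = butter(4, cutoff_hz / (fs / 2), btype="low", output="sos")
    return sosfiltfilt(sos, x)

def interrupt(x, fs, rate_hz, duty=0.5):
    # Gate the signal: audible for `duty` of each cycle, silent otherwise.
    t = np.arange(len(x)) / fs
    gate = ((t * rate_hz) % 1.0) < duty
    return x * gate

fs = 16000
x = np.random.default_rng(1).standard_normal(fs)   # stand-in for a sentence
y = interrupt(low_pass(x, fs, cutoff_hz=1500), fs, rate_hz=1.5)
```

Applying the gate after the filter mirrors the combined condition; dropping either call gives the corresponding single-manipulation condition.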
Affiliation(s)
- Pranesh Bhargava
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, P.O. Box 30.001, 9700 RB Groningen, The Netherlands.
|
29
|
Abstracts. Int J Audiol 2011. [DOI: 10.3109/14992027.2011.588967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
|
30
|
Roberts B, Summers RJ, Bailey PJ. The intelligibility of noise-vocoded speech: spectral information available from across-channel comparison of amplitude envelopes. Proc Biol Sci 2010; 278:1595-600. [PMID: 21068039 DOI: 10.1098/rspb.2010.1554] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
Noise-vocoded (NV) speech is often regarded as conveying phonetic information primarily through temporal-envelope cues rather than spectral cues. However, listeners may infer the formant frequencies in the vocal-tract output (a key source of phonetic detail) from across-band differences in amplitude when speech is processed through a small number of channels. The potential utility of this spectral information was assessed for NV speech created by filtering sentences into six frequency bands, and using the amplitude envelope of each band (≤30 Hz) to modulate a matched noise-band carrier (N). Bands were paired, corresponding to F1 (≈N1 + N2), F2 (≈N3 + N4) and the higher formants (F3' ≈ N5 + N6), such that the frequency contour of each formant was implied by variations in relative amplitude between bands within the corresponding pair. Three-formant analogues (F0 = 150 Hz) of the NV stimuli were synthesized using frame-by-frame reconstruction of the frequency and amplitude of each formant. These analogues were less intelligible than the NV stimuli or analogues created using contours extracted from spectrograms of the original sentences, but more intelligible than when the frequency contours were replaced with constant (mean) values. Across-band comparisons of amplitude envelopes in NV speech can provide phonetically important information about the frequency contours of the underlying formants.
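The across-band principle in this abstract can be shown with a toy calculation: within a band pair, the implied formant frequency can be read off the relative envelope amplitudes. The band-centre values and the amplitude-weighted-mean rule below are illustrative assumptions, not the paper's reconstruction method.

```python
# Toy illustration: inferring an implied formant frequency from the
# relative amplitudes of two adjacent noise bands.
def implied_formant(center_lo, center_hi, amp_lo, amp_hi):
    # Equal amplitudes imply a frequency midway between the band centres;
    # tilting the amplitudes pulls the estimate toward the louder band.
    return (amp_lo * center_lo + amp_hi * center_hi) / (amp_lo + amp_hi)

# Hypothetical F2 pair (bands N3 + N4) with centres at 1100 and 1800 Hz
mid = implied_formant(1100.0, 1800.0, amp_lo=1.0, amp_hi=1.0)   # 1450.0
low = implied_formant(1100.0, 1800.0, amp_lo=3.0, amp_hi=1.0)   # 1275.0
```

Frame-by-frame, such relative-amplitude variations trace out a frequency contour even though each individual band carries only an envelope.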
Affiliation(s)
- Brian Roberts
- Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, UK.
|
31
|
Gilbert G, Lorenzi C. Role of spectral and temporal cues in restoring missing speech information. J Acoust Soc Am 2010; 128:EL294-9. [PMID: 21110541 DOI: 10.1121/1.3501962] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2023]
Abstract
This study measured the role of spectral details and of temporal envelope (E) and temporal fine structure (TFS) cues in reconstructing sentences from speech fragments. Four sets of sentences were processed using a 32-band vocoder. Twenty-one bands were either processed or removed, yielding sentences that differed in their amount of spectral detail and of E and TFS information. These sentences remained perfectly intelligible, but intelligibility fell significantly after the introduction of periodic 120-ms silent gaps. While the role of E was unclear, the results unambiguously showed that TFS cues and spectral details influence the ability to reconstruct interrupted sentences.
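The band-manipulation idea in this abstract (keeping some bands intact, with spectral detail and TFS, while reducing others to their envelope) can be sketched as below. This is a simplified illustration: brick-wall FFT band splitting stands in for the paper's 32-band filter bank, the Hilbert envelope stands in for its E extraction, and which bands stay intact is an arbitrary choice here.

```python
# Sketch: 32-band analysis in which some bands are kept intact
# (E + TFS + spectral detail) and the rest are noise-vocoded (E only).
import numpy as np
from scipy.signal import hilbert

def split_bands(x, fs, edges):
    # Brick-wall FFT band split (simplification of a Butterworth filter bank)
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    return [np.fft.irfft(X * ((freqs >= lo) & (freqs < hi)), n=len(x))
            for lo, hi in zip(edges[:-1], edges[1:])]

def mix_bands(x, fs, n_bands=32, n_intact=11, seed=0):
    rng = np.random.default_rng(seed)
    edges = np.geomspace(80.0, 0.45 * fs, n_bands + 1)
    bands = split_bands(x, fs, edges)
    noise_bands = split_bands(rng.standard_normal(len(x)), fs, edges)
    out = np.zeros(len(x))
    for i, (band, noise) in enumerate(zip(bands, noise_bands)):
        if i < n_intact:
            out += band                      # intact band: E + TFS preserved
        else:
            env = np.abs(hilbert(band))      # Hilbert envelope (E only)
            out += env * noise               # envelope on a noise carrier
    return out

fs = 16000
x = np.random.default_rng(1).standard_normal(fs)   # stand-in for a sentence
y = mix_bands(x, fs)
```

Varying how many bands keep TFS versus E-only processing yields stimulus sets that differ in spectral detail and fine-structure information, analogous to the four sentence sets in the study.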
Affiliation(s)
- Gaëtan Gilbert
- Centre de Recherche et de Formation en Audioprothèse, UFR Pharmacie, 15 Avenue Charles Flahault, BP 14491, 34093 Montpellier Cedex 5, France.
|
32
|
Başkent D, Chatterjee M. Recognition of temporally interrupted and spectrally degraded sentences with additional unprocessed low-frequency speech. Hear Res 2010; 270:127-33. [PMID: 20817081 DOI: 10.1016/j.heares.2010.08.011] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/02/2010] [Revised: 08/11/2010] [Accepted: 08/20/2010] [Indexed: 11/24/2022]
Abstract
Recognition of periodically interrupted sentences (interruption rate 1.5 Hz, 50% duty cycle) was investigated under conditions of spectral degradation, implemented with a noiseband vocoder, with and without additional unprocessed low-pass filtered speech (cutoff frequency 500 Hz). Intelligibility of interrupted speech decreased with increasing spectral degradation. For all spectral degradation conditions, however, adding the unprocessed low-pass filtered speech enhanced intelligibility. The improvement at 4 and 8 channels was higher than at 16 and 32 channels: 19% and 8% on average, respectively. The Articulation Index predicted an improvement of 0.09 on a scale from 0 to 1; thus, the improvement in the poorest spectral degradation conditions was larger than would be expected from the additional speech information alone. The results therefore implied that fine temporal cues from the unprocessed low-frequency speech, such as additional voice pitch cues, helped perceptual integration of temporally interrupted and spectrally degraded speech, especially when the spectral degradations were severe. Considering the vocoder processing as a cochlear implant simulation, where implant users' performance is closest to 4- and 8-channel vocoder performance, the results support an additional benefit of low-frequency acoustic input in combined electric-acoustic stimulation for perception of temporally degraded speech.
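The combined electric-acoustic (EAS) simulation in this abstract amounts to adding unprocessed speech below 500 Hz to the vocoded signal. The sketch below shows only that low-pass-and-add step; the vocoder is abstracted to a caller-supplied function, and the brick-wall FFT low-pass filter is an illustrative simplification.

```python
# Minimal EAS-simulation sketch: vocoded ("electric") signal plus
# unprocessed low-pass speech below 500 Hz ("acoustic").
import numpy as np

def low_pass_fft(x, fs, cutoff_hz=500.0):
    # Brick-wall FFT low-pass (simplification of an analog/IIR filter)
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    return np.fft.irfft(X * (freqs <= cutoff_hz), n=len(x))

def simulate_eas(x, fs, vocode):
    # `vocode` is any vocoder function, e.g. an n-channel noiseband vocoder
    return vocode(x) + low_pass_fft(x, fs, cutoff_hz=500.0)

fs = 16000
x = np.random.default_rng(2).standard_normal(fs)   # stand-in for a sentence
y = simulate_eas(x, fs, vocode=lambda s: np.zeros_like(s))  # trivial stand-in vocoder
```

With a real noiseband vocoder plugged in, the added low-frequency path carries the fine temporal (voice pitch) cues the abstract credits with aiding perceptual integration.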
Affiliation(s)
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, Groningen, The Netherlands.
|