1. Fletcher MD, Akis E, Verschuur CA, Perry SW. Improved tactile speech perception and noise robustness using audio-to-tactile sensory substitution with amplitude envelope expansion. Sci Rep 2024; 14:15029. [PMID: 38951556] [PMCID: PMC11217272] [DOI: 10.1038/s41598-024-65510-6]
Abstract
Recent advances in haptic technology could allow haptic hearing aids, which convert audio to tactile stimulation, to become viable for supporting people with hearing loss. A tactile vocoder strategy for audio-to-tactile conversion, which exploits these advances, has recently shown significant promise. In this strategy, the amplitude envelope is extracted from several audio frequency bands and used to modulate the amplitude of a set of vibro-tactile tones. The vocoder strategy allows good consonant discrimination, but vowel discrimination is poor and the strategy is susceptible to background noise. In the current study, we assessed whether multi-band amplitude envelope expansion can effectively enhance critical vowel features, such as formants, and improve speech extraction from noise. In 32 participants with normal touch perception, tactile-only phoneme discrimination with and without envelope expansion was assessed both in quiet and in background noise. Envelope expansion improved performance in quiet by 10.3% for vowels and by 5.9% for consonants. In noise, envelope expansion improved overall phoneme discrimination by 9.6%, with no difference in benefit between consonants and vowels. The tactile vocoder with envelope expansion can be deployed in real-time on a compact device and could substantially improve clinical outcomes for a new generation of haptic hearing aids.
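To make the processing chain concrete, here is a minimal Python sketch of a multi-band audio-to-tactile vocoder with amplitude envelope expansion, in the spirit of the strategy described above. The band edges, carrier tone frequencies, envelope smoothing cutoff, and expansion exponent are illustrative assumptions, not the parameters used in the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def tactile_vocoder(audio, fs, band_edges, tone_freqs, expansion=2.0):
    """Audio-to-tactile vocoder with per-band amplitude envelope expansion.

    band_edges : list of (low, high) audio band edges in Hz (assumed values)
    tone_freqs : one low-frequency vibrotactile carrier per band, in Hz
    expansion  : exponent applied to each normalized envelope; 1.0 gives the
                 plain vocoder, values > 1 expand envelope contrasts
    """
    t = np.arange(len(audio)) / fs
    smoother = butter(2, 23, btype="lowpass", fs=fs, output="sos")  # envelope smoother
    out = np.zeros(len(audio))
    for (lo, hi), f_tone in zip(band_edges, tone_freqs):
        bp = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(bp, audio)
        env = np.maximum(sosfiltfilt(smoother, np.abs(hilbert(band))), 0.0)
        env = (env / (env.max() + 1e-12)) ** expansion       # envelope expansion
        out += env * np.sin(2 * np.pi * f_tone * t)          # modulate tactile tone
    return out / (np.abs(out).max() + 1e-12)                 # scale for the actuator

# Illustrative 8-band configuration (hypothetical, not the published settings):
# edges = np.geomspace(100, 7000, 9)
# vib = tactile_vocoder(audio, fs, list(zip(edges[:-1], edges[1:])), np.linspace(94, 250, 8))
```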
Affiliation(s)
- Mark D Fletcher
- University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Institute of Sound and Vibration Research, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Esma Akis
- University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Institute of Sound and Vibration Research, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Carl A Verschuur
- University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Samuel W Perry
- University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Institute of Sound and Vibration Research, University of Southampton, University Road, Southampton, SO17 1BJ, UK
2. Hartman J, Saffran J, Litovsky R. Word Learning in Deaf Adults Who Use Cochlear Implants: The Role of Talker Variability and Attention to the Mouth. Ear Hear 2024; 45:337-350. [PMID: 37695563] [PMCID: PMC10920394] [DOI: 10.1097/aud.0000000000001432]
Abstract
OBJECTIVES Although cochlear implants (CIs) facilitate spoken language acquisition, many CI listeners experience difficulty learning new words. Studies have shown that highly variable stimulus input and audiovisual cues improve speech perception in CI listeners. However, less is known about whether these two factors improve perception in a word-learning context. Furthermore, few studies have examined how CI listeners direct their gaze to efficiently capture visual information available on a talker's face. The purpose of this study was two-fold: (1) to examine whether talker variability could improve word learning in CI listeners and (2) to examine how CI listeners direct their gaze while viewing a talker speak. DESIGN Eighteen adults with CIs and 10 adults with normal hearing (NH) learned eight novel word-object pairs spoken by a single talker or six different talkers (multiple talkers). The word-learning task consisted of nonsense words following the phonotactic rules of English. Learning was probed using a novel talker in a two-alternative forced-choice eye gaze task. Learners' eye movements to the mouth and the target object (accuracy) were tracked over time. RESULTS Both groups performed near ceiling during the test phase, regardless of whether they learned from the same talker or different talkers. However, compared to listeners with NH, CI listeners directed their gaze significantly more to the talker's mouth while learning the words. CONCLUSIONS Unlike NH listeners who can successfully learn words without focusing on the talker's mouth, CI listeners tended to direct their gaze to the talker's mouth, which may facilitate learning. This finding is consistent with the hypothesis that CI listeners use a visual processing strategy that efficiently captures redundant audiovisual speech cues available at the mouth. Due to ceiling effects, however, it is unclear whether talker variability facilitated word learning for adult CI listeners, an issue that should be addressed in future work using more difficult listening conditions.
Affiliation(s)
- Jasenia Hartman
- Department of Psychology and Neuroscience, Duke University, Durham, NC 27708
- Neuroscience Training Program, University of Wisconsin-Madison, Madison, WI 53706
- Jenny Saffran
- Department of Psychology, University of Wisconsin-Madison, Madison, WI 53706
- Ruth Litovsky
- Neuroscience Training Program, University of Wisconsin-Madison, Madison, WI 53706
- Communication Sciences and Disorders, University of Wisconsin-Madison, Madison, WI 53706
3. Fletcher MD, Akis E, Verschuur CA, Perry SW. Improved tactile speech perception using audio-to-tactile sensory substitution with formant frequency focusing. Sci Rep 2024; 14:4889. [PMID: 38418558] [PMCID: PMC10901863] [DOI: 10.1038/s41598-024-55429-3]
Abstract
Haptic hearing aids, which provide speech information through tactile stimulation, could substantially improve outcomes for both cochlear implant users and for those unable to access cochlear implants. Recent advances in wide-band haptic actuator technology have made new audio-to-tactile conversion strategies viable for wearable devices. One such strategy filters the audio into eight frequency bands, which are evenly distributed across the speech frequency range. The amplitude envelopes from the eight bands modulate the amplitudes of eight low-frequency tones, which are delivered through vibration to a single site on the wrist. This tactile vocoder strategy effectively transfers some phonemic information, but vowels and obstruent consonants are poorly portrayed. In 20 participants with normal touch perception, we tested (1) whether focusing the audio filters of the tactile vocoder more densely around the first and second formant frequencies improved tactile vowel discrimination, and (2) whether focusing filters at mid-to-high frequencies improved obstruent consonant discrimination. The obstruent-focused approach was found to be ineffective. However, the formant-focused approach improved vowel discrimination by 8%, without changing overall consonant discrimination. The formant-focused tactile vocoder strategy, which can readily be implemented in real time on a compact device, could substantially improve speech perception for haptic hearing aid users.
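The formant-focused variant amounts to reallocating analysis bands so that more of them sample the F1 and F2 regions. The sketch below shows one way such a band layout could be built in Python; the frequency ranges and band counts are illustrative guesses, not the filter design reported in the paper.

```python
import numpy as np

def log_spaced_edges(lo, hi, n_bands):
    """n_bands contiguous band edges, equally spaced on a log-frequency axis."""
    return np.geomspace(lo, hi, n_bands + 1)

def formant_focused_edges():
    """Illustrative 8-band layout concentrating bands on typical F1 (~300-1000 Hz)
    and F2 (~1000-2500 Hz) ranges, with one band below and one above."""
    f1 = log_spaced_edges(300, 1000, 3)          # 3 bands across the F1 range
    f2 = log_spaced_edges(1000, 2500, 3)         # 3 bands across the F2 range
    return np.concatenate([[50.0], f1, f2[1:], [7000.0]])   # 9 edges -> 8 bands
```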
Affiliation(s)
- Mark D Fletcher
- University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Institute of Sound and Vibration Research, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Esma Akis
- University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Institute of Sound and Vibration Research, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Carl A Verschuur
- University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Samuel W Perry
- University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Institute of Sound and Vibration Research, University of Southampton, University Road, Southampton, SO17 1BJ, UK
4. Bosen AK. Characterizing correlations in partial credit speech recognition scoring with beta-binomial distributions. JASA Express Lett 2024; 4:025202. [PMID: 38299983] [PMCID: PMC10848658] [DOI: 10.1121/10.0024633]
Abstract
Partial credit scoring for speech recognition tasks can improve measurement precision. However, assessing the magnitude of this improvement with partial credit scoring is challenging because meaningful speech contains contextual cues, which create correlations between the probabilities of correctly identifying each token in a stimulus. Here, beta-binomial distributions were used to estimate recognition accuracy and intraclass correlation for phonemes in words and words in sentences in listeners with cochlear implants (N = 20). Estimates demonstrated substantial intraclass correlation in recognition accuracy within stimuli. These correlations were invariant across individuals. Intraclass correlations should be addressed in power analysis of partial credit scoring.
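As a concrete illustration of the modeling approach described above, the sketch below fits a beta-binomial to per-stimulus scores by maximum likelihood and reports the mean accuracy a/(a+b) together with the intraclass correlation 1/(a+b+1). The function and the example data are hypothetical; only the distributional idea follows the abstract.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import betabinom

def fit_beta_binomial(correct, n_tokens):
    """Fit a beta-binomial to per-stimulus scores (tokens correct out of n_tokens)."""
    correct = np.asarray(correct)
    n_tokens = np.asarray(n_tokens)

    def neg_log_lik(log_ab):
        a, b = np.exp(log_ab)                            # keep a, b positive
        return -betabinom.logpmf(correct, n_tokens, a, b).sum()

    res = minimize(neg_log_lik, x0=np.log([2.0, 2.0]), method="Nelder-Mead")
    a, b = np.exp(res.x)
    accuracy = a / (a + b)                               # mean proportion correct
    icc = 1.0 / (a + b + 1.0)                            # within-stimulus correlation
    return accuracy, icc

# Hypothetical data: eight 5-word sentences scored by words correct.
acc, icc = fit_beta_binomial(correct=[5, 3, 0, 5, 4, 2, 5, 1], n_tokens=[5] * 8)
```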
Affiliation(s)
- Adam K Bosen
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68131
5. Nourski KV, Steinschneider M, Rhone AE, Berger JI, Dappen ER, Kawasaki H, Howard MA III. Intracranial electrophysiology of spectrally degraded speech in the human cortex. Front Hum Neurosci 2024; 17:1334742. [PMID: 38318272] [PMCID: PMC10839784] [DOI: 10.3389/fnhum.2023.1334742]
Abstract
Introduction Cochlear implants (CIs) are the treatment of choice for severe to profound hearing loss. Variability in CI outcomes remains despite advances in technology and is attributed in part to differences in cortical processing. Studying these differences in CI users is technically challenging. Spectrally degraded stimuli presented to normal-hearing individuals approximate input to the central auditory system in CI users. This study used intracranial electroencephalography (iEEG) to investigate cortical processing of spectrally degraded speech. Methods Participants were adult neurosurgical epilepsy patients. Stimuli were utterances /aba/ and /ada/, spectrally degraded using a noise vocoder (1-4 bands) or presented without vocoding. The stimuli were presented in a two-alternative forced choice task. Cortical activity was recorded using depth and subdural iEEG electrodes. Electrode coverage included auditory core in posteromedial Heschl's gyrus (HGPM), superior temporal gyrus (STG), ventral and dorsal auditory-related areas, and prefrontal and sensorimotor cortex. Analysis focused on high gamma (70-150 Hz) power augmentation and alpha (8-14 Hz) suppression. Results Task performance was at chance with 1-2 spectral bands and near ceiling for clear stimuli. Performance was variable with 3-4 bands, permitting identification of good and poor performers. There was no relationship between task performance and participants' demographic, audiometric, neuropsychological, or clinical profiles. Several response patterns were identified based on magnitude and differences between stimulus conditions. HGPM responded strongly to all stimuli. A preference for clear speech emerged within non-core auditory cortex. Good performers typically had strong responses to all stimuli along the dorsal stream, including posterior STG, supramarginal, and precentral gyrus; a minority of sites in STG and supramarginal gyrus had a preference for vocoded stimuli. In poor performers, responses were typically restricted to clear speech. Alpha suppression was more pronounced in good performers. In contrast, poor performers exhibited a greater involvement of posterior middle temporal gyrus when listening to clear speech. Discussion Responses to noise-vocoded speech provide insights into potential factors underlying CI outcome variability. The results emphasize differences in the balance of neural processing along the dorsal and ventral stream between good and poor performers, identify specific cortical regions that may have diagnostic and prognostic utility, and suggest potential targets for neuromodulation-based CI rehabilitation strategies.
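High gamma power and alpha suppression are both band-limited envelope measures, so a single helper covers the core computation. The sketch below shows that step only, under assumed filter settings; the study's full pipeline (epoching, baseline normalization, statistics) is not reproduced here.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def band_power_envelope(ieeg, fs, band=(70.0, 150.0)):
    """Single-channel power envelope in a frequency band (e.g., high gamma).

    Band-pass filter, take the analytic amplitude, and square it. Use
    band=(8.0, 14.0) for the alpha measure; the filter order is an assumption.
    """
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    analytic = hilbert(sosfiltfilt(sos, ieeg))
    return np.abs(analytic) ** 2
```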
Affiliation(s)
- Kirill V. Nourski
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Iowa Neuroscience Institute, The University of Iowa, Iowa City, IA, United States
- Mitchell Steinschneider
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Departments of Neurology and Neuroscience, Albert Einstein College of Medicine, Bronx, NY, United States
- Ariane E. Rhone
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Joel I. Berger
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Emily R. Dappen
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Iowa Neuroscience Institute, The University of Iowa, Iowa City, IA, United States
- Hiroto Kawasaki
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Matthew A. Howard III
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Iowa Neuroscience Institute, The University of Iowa, Iowa City, IA, United States
- Pappajohn Biomedical Institute, The University of Iowa, Iowa City, IA, United States
6. Fletcher MD, Verschuur CA, Perry SW. Improving speech perception for hearing-impaired listeners using audio-to-tactile sensory substitution with multiple frequency channels. Sci Rep 2023; 13:13336. [PMID: 37587166] [PMCID: PMC10432540] [DOI: 10.1038/s41598-023-40509-7]
Abstract
Cochlear implants (CIs) have revolutionised treatment of hearing loss, but large populations globally cannot access them either because of disorders that prevent implantation or because they are expensive and require specialist surgery. Recent technology developments mean that haptic aids, which transmit speech through vibration, could offer a viable low-cost, non-invasive alternative. One important development is that compact haptic actuators can now deliver intense stimulation across multiple frequencies. We explored whether these multiple frequency channels can transfer spectral information to improve tactile phoneme discrimination. To convert audio to vibration, the speech amplitude envelope was extracted from one or more audio frequency bands and used to amplitude modulate one or more vibro-tactile tones delivered to a single site on the wrist. In 26 participants with normal touch sensitivity, tactile-only phoneme discrimination was assessed with one, four, or eight frequency bands. Compared to one frequency band, performance improved by 5.9% with four frequency bands and by 8.4% with eight frequency bands. The multi-band signal-processing approach can be implemented in real-time on a compact device, and the vibro-tactile tones can be reproduced by the latest compact, low-powered actuators. This approach could therefore readily be implemented in a low-cost haptic hearing aid to deliver real-world benefits.
Affiliation(s)
- Mark D Fletcher
- University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Institute of Sound and Vibration Research, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Carl A Verschuur
- University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Samuel W Perry
- University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Institute of Sound and Vibration Research, University of Southampton, University Road, Southampton, SO17 1BJ, UK
7. Yang J, Wang X, Yu J, Xu L. Intelligibility of Word-Initial Obstruent Consonants in Mandarin-Speaking Prelingually Deafened Children With Cochlear Implants. J Speech Lang Hear Res 2023:1-22. [PMID: 37208163] [DOI: 10.1044/2023_jslhr-22-00268]
Abstract
PURPOSE This study assessed the intelligibility of obstruent consonants in prelingually deafened Mandarin-speaking children with cochlear implants (CIs). METHOD Twenty-two Mandarin-speaking children with normal hearing (NH) aged 3.25-10.0 years and 35 Mandarin-speaking children with CIs aged 3.77-15.0 years were recruited to produce a list of Mandarin words composed of 17 word-initial obstruent consonants in different vowel contexts. The children with CIs were assigned to chronological age-matched (CA) and hearing age-matched (HA) subgroups with reference to the NH controls. One hundred naïve NH adult listeners were recruited for a consonant identification task that consisted of a total of 2,663 stimulus tokens through an online research platform. For each child speaker, the consonant productions were judged by seven to 12 different adult listeners. An average percentage of consonants correct was calculated across all listeners for each consonant. RESULTS The CI children in both the CA and HA subgroups showed lower intelligibility in their consonant productions than the NH controls. Among the 17 obstruents, both CI subgroups showed higher intelligibility for stops, but they demonstrated major problems with the sibilant fricatives and affricates and showed a different confusion pattern from the NH controls on these sibilants. Of the three places (alveolar, alveolopalatal, and retroflex) in Mandarin sibilants, both CI subgroups showed the lowest intelligibility and the greatest difficulties with alveolar sounds. For the NH children, there was a significant positive relationship between overall consonant intelligibility and chronological age. For the children with CIs, the best fit regression model revealed significant effects of chronological age and age at implantation, with their quadratic terms included. CONCLUSIONS Mandarin-speaking children with CIs experience major challenges in the three-way place contrasts of sibilant sounds in consonant production. Chronological age and the combined effect of CI-related time variables play important roles in the development of obstruent consonants in the CI children.
Affiliation(s)
- Jing Yang
- Program of Communication Sciences and Disorders, University of Wisconsin-Milwaukee
- Xianhui Wang
- Hearing, Speech and Language Sciences, Ohio University, Athens
- Jue Yu
- Center for Speech and Language Processing, School of Foreign Languages, Tongji University, Shanghai, China
- Li Xu
- Hearing, Speech and Language Sciences, Ohio University, Athens
8. Bochner J, Samar V, Prud'hommeaux E, Huenerfauth M. Phoneme Categorization in Prelingually Deaf Adult Cochlear Implant Users. J Speech Lang Hear Res 2022; 65:4429-4453. [PMID: 36279201] [DOI: 10.1044/2022_jslhr-22-00038]
Abstract
PURPOSE Phoneme categorization (PC) for voice onset time and second formant transition was studied in adult cochlear implant (CI) users with early-onset deafness and hearing controls. METHOD Identification and discrimination tasks were administered to 30 participants implanted before 4 years of age, 21 participants implanted after 7 years of age, and 21 hearing individuals. RESULTS Distinctive identification and discrimination functions confirmed PC within all groups. Compared to hearing participants, the CI groups generally displayed longer/higher category boundaries, shallower identification function slopes, reduced identification consistency, and reduced discrimination performance. A principal component analysis revealed that identification consistency, discrimination accuracy, and identification function slope, but not boundary location, loaded on a single factor, reflecting general PC performance. Earlier implantation was associated with better PC performance within the early CI group, but not the late CI group. Within the early CI group, earlier implantation age but not PC performance was associated with better speech recognition. Conversely, within the late CI group, better PC performance but not earlier implantation age was associated with better speech recognition. CONCLUSIONS Results suggest that implantation timing within the sensitive period before 4 years of age partly determines the level of PC performance. They also suggest that early implantation may promote development of higher level processes that can compensate for relatively poor PC performance, as can occur in challenging listening conditions.
Affiliation(s)
- Joseph Bochner
- National Technical Institute for the Deaf, Rochester Institute of Technology, NY
- Vincent Samar
- National Technical Institute for the Deaf, Rochester Institute of Technology, NY
- Matt Huenerfauth
- Golisano College of Computing and Information Sciences, Rochester Institute of Technology, NY
9. Winn MB, Teece KH. Effortful Listening Despite Correct Responses: The Cost of Mental Repair in Sentence Recognition by Listeners With Cochlear Implants. J Speech Lang Hear Res 2022; 65:3966-3980. [PMID: 36112516] [PMCID: PMC9927629] [DOI: 10.1044/2022_jslhr-21-00631]
Abstract
PURPOSE Speech recognition percent correct scores fail to capture the effort of mentally repairing the perception of speech that was initially misheard. This study measured the effort of listening to stimuli specifically designed to elicit mental repair in adults who use cochlear implants (CIs). METHOD CI listeners heard and repeated sentences in which specific words were distorted or masked by noise but recovered based on later context: a signature of mental repair. Changes in pupil dilation were tracked as an index of effort and time-locked with specific landmarks during perception. RESULTS Effort significantly increases when a listener needs to repair a misperceived word, even if the verbal response is ultimately correct. Mental repair of words in a sentence was accompanied by greater prevalence of errors elsewhere in the same sentence, suggesting that effort spreads to consume resources across time. The cost of mental repair in CI listeners was essentially the same as that observed in listeners with normal hearing in previous work. CONCLUSIONS Listening effort as tracked by pupil dilation is better explained by the mental repair and reconstruction of words rather than the appearance of correct or incorrect perception. Linguistic coherence drives effort more heavily than the mere presence of mistakes, highlighting the importance of testing materials that do not constrain coherence by design.
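Pupillometric effort measures of this kind are built from baseline-corrected epochs time-locked to landmarks within each trial. A minimal sketch of that step is shown below; the window and baseline choices are illustrative assumptions rather than the authors' exact analysis parameters.

```python
import numpy as np

def epoch_pupil(trace, fs, event_time, baseline=(-0.5, 0.0), window=(-0.5, 3.0)):
    """Baseline-corrected pupil epoch around one landmark (e.g., sentence offset).

    trace      : pupil-size samples from the eye tracker
    fs         : eye-tracker sampling rate in Hz
    event_time : landmark time in seconds within the recording
    """
    i0, i1 = (int((event_time + w) * fs) for w in window)
    b0, b1 = (int((event_time + b) * fs) for b in baseline)
    base = float(np.mean(trace[b0:b1]))
    return (np.asarray(trace[i0:i1], dtype=float) - base) / base   # proportional change
```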
Affiliation(s)
- Matthew B. Winn
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, Minneapolis
- Katherine H. Teece
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, Minneapolis
10. Eshaghi M, Darouie A, Teymouri R. The Auditory Perception of Consonant Contrasts in Cochlear Implant Children. Indian J Otolaryngol Head Neck Surg 2022; 74:455-459. [PMID: 36032915] [PMCID: PMC9411492] [DOI: 10.1007/s12070-020-02250-9]
Abstract
Background and Objectives A major part of speech perception is based on understanding and distinguishing between vocal cues in the speaker's speech. Consonants and vowels are vocal cues that can be affected by hearing impairment, and their perception may thus be reduced or distorted. The present study aims to investigate the auditory perception of consonant contrasts in cochlear implant children. Materials and Methods The present cross-sectional, descriptive-analytical study was conducted on 24 cochlear implant children aged 9-13 years, selected through convenience sampling from schools and cochlear implant centers. A test of non-word pairs based on a study conducted by Khavar-Ghazlani was carried out to measure consonant contrasts in place of articulation, manner of articulation, and voicing. Results The results of the test showed that cochlear implant children scored lower in the perception of voicing compared to the other two features. No significant differences were observed between their perceptions of place of articulation and manner of articulation. Conclusion Cochlear implant children appear to have a poorer perception of voicing contrast compared to the other features, which may be due to the greater reliance of this feature on auditory cues.
11. Arjmandi MK, Jahn KN, Arenberg JG. Single-Channel Focused Thresholds Relate to Vowel Identification in Pediatric and Adult Cochlear Implant Listeners. Trends Hear 2022; 26:23312165221095364. [PMID: 35505617] [PMCID: PMC9073113] [DOI: 10.1177/23312165221095364]
Abstract
Speech recognition outcomes are highly variable among pediatric and adult cochlear implant (CI) listeners. Although there is some evidence that the quality of the electrode-neuron interface (ENI) contributes to this large variability in auditory perception, its relationship with speech outcomes is not well understood. Single-channel auditory detection thresholds measured in response to focused electrical fields (i.e., focused thresholds) are sensitive to properties of ENI quality, including electrode-neuron distance, intracochlear resistance, and neural health. In the present study, focused thresholds and speech perception abilities were assessed in 15 children and 21 adult CI listeners. Focused thresholds were measured for all active electrodes using a fast sweep procedure. Speech perception performance was evaluated by assessing listeners’ ability to identify vowels presented in /h-vowel-d/ context. Consistent with prior literature, focused thresholds were lower for children than for adults, but vowel identification did not differ significantly across age groups. Higher across-array average focused thresholds, which may indicate a relatively poor ENI quality, were associated with poorer vowel identification scores in both children and adults. Adult CI listeners with longer durations of deafness had higher focused thresholds. Findings from this study demonstrate that poor-quality ENIs may contribute to reduced speech outcomes for pediatric and adult CI listeners. Estimates of ENI quality (e.g., focused thresholds) may assist in developing customized programming interventions that serve to improve the transmission of spectral cues that are important in vowel identification.
Affiliation(s)
- Meisam K Arjmandi
- Department of Otolaryngology - Head and Neck Surgery, Harvard Medical School, Boston, MA, USA; Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, MA, USA; Audiology Division, Massachusetts Eye and Ear, Boston, MA, USA
- Kelly N Jahn
- Department of Otolaryngology - Head and Neck Surgery, Harvard Medical School, Boston, MA, USA; Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, MA, USA; Department of Speech, Language, and Hearing, University of Texas at Dallas, Richardson, TX, USA
- Julie G Arenberg
- Department of Otolaryngology - Head and Neck Surgery, Harvard Medical School, Boston, MA, USA; Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, MA, USA; Audiology Division, Massachusetts Eye and Ear, Boston, MA, USA
12. Winn MB, O'Brien G. Distortion of Spectral Ripples Through Cochlear Implants Has Major Implications for Interpreting Performance Scores. Ear Hear 2021; 43:764-772. [PMID: 34966157] [PMCID: PMC9010354] [DOI: 10.1097/aud.0000000000001162]
Abstract
The spectral ripple discrimination task is a psychophysical measure that has been found to correlate with speech recognition in listeners with cochlear implants (CIs). However, at ripple densities above a critical value (around 2 ripples per octave (RPO), but device-specific), the sparse spectral sampling of CI processors distorts the stimulus, producing aliasing and unintended changes in modulation depth. As a result, spectral ripple thresholds above that value are not ordered monotonically along the RPO dimension and therefore cannot be interpreted as better or worse spectral resolution relative to one another, which undermines correlation measurements. These stimulus distortions are not remediated by changing stimulus phase, indicating that these issues cannot be solved by spectrotemporally modulated stimuli. Speech generally has very low-density spectral modulations, leading to questions about the mechanism of correlation between high ripple thresholds and speech recognition. Existing data showing correlations between ripple discrimination and speech recognition include many observations above the aliasing limit. These scores should be treated with caution, and experimenters could benefit by prospectively considering the limitations of the spectral ripple test.
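For concreteness, the sketch below generates a spectral ripple stimulus (broadband noise with a sinusoidal spectral envelope on a log-frequency axis) and notes a Nyquist-style rule of thumb for the channel-sampling limit the abstract refers to. All parameter values are illustrative, and the actual aliasing point is device-specific.

```python
import numpy as np

def ripple_noise(fs=44100, dur=0.5, rpo=2.0, depth_db=30.0, phase=0.0,
                 f_lo=100.0, f_hi=8000.0, seed=0):
    """Noise with a sinusoidal spectral envelope of `rpo` ripples per octave."""
    rng = np.random.default_rng(seed)
    n = int(fs * dur)
    spec = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, 1.0 / fs)

    gain_db = np.full_like(freqs, -120.0)              # suppress out-of-band energy
    band = (freqs >= f_lo) & (freqs <= f_hi)
    octaves = np.log2(freqs[band] / f_lo)
    gain_db[band] = 0.5 * depth_db * np.sin(2 * np.pi * rpo * octaves + phase)

    shaped = np.fft.irfft(spec * 10 ** (gain_db / 20.0), n)
    return shaped / np.max(np.abs(shaped))

# Rule-of-thumb sampling limit: n_channels / (2 * octaves_covered) ripples per
# octave; e.g., 16 channels over ~6.3 octaves can represent only ~1.3 RPO.
```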
Affiliation(s)
- Matthew B Winn
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minnesota, USA
- G. O'Brien
- School of Information, University of Michigan, Ann Arbor, Michigan, USA
13. van Wieringen A, Magits S, Francart T, Wouters J. Home-Based Speech Perception Monitoring for Clinical Use With Cochlear Implant Users. Front Neurosci 2021; 15:773427. [PMID: 34916902] [PMCID: PMC8669965] [DOI: 10.3389/fnins.2021.773427]
Abstract
Speech-perception testing is essential for monitoring outcomes with a hearing aid or cochlear implant (CI). However, clinical care is time-consuming and often challenging with an increasing number of clients. A potential approach to alleviating some of this clinical load, and possibly making room for other outcome measures, is to employ technologies that assess performance in the home environment. In this study, we investigated three speech perception indices in the same 40 CI users: phoneme identification (vowels and consonants), digits in noise (DiN), and sentence recognition in noise (SiN). The first two tasks were implemented on a tablet and performed multiple times by each client in their home environment, while the sentence task was administered at the clinic. The outcomes showed that DiN assessed at home can serve as an alternative to SiN assessed at the clinic: DiN scores are in line with SiN scores, apart from a 3-4 dB improvement, and are useful for monitoring performance at regular intervals and detecting changes in auditory performance. Phoneme identification in quiet also explains a significant part of speech perception in noise, and provides additional information on the detectability and discriminability of speech cues. The added benefit of the phoneme identification task, which also proved easy to administer at home, is the information transmission analysis available in addition to the summary score. Performance changes for the different indices can be interpreted by comparing against measurement error and help to target personalized rehabilitation. Altogether, home-based speech testing is reliable and a powerful complement to clinical care for CI users.
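Digits-in-noise tests of the kind described here are usually run as an adaptive signal-to-noise ratio track. The sketch below shows a generic one-up/one-down version; the step size, trial count, scoring rule, and the play_and_score callback are assumptions, not the implementation used in the study.

```python
import numpy as np

def digits_in_noise_srt(play_and_score, n_trials=24, start_snr=0.0, step_db=2.0):
    """Adaptive one-up/one-down track for a digits-in-noise test.

    play_and_score(snr_db) must present a digit triplet at the given SNR and
    return True if the whole triplet was repeated correctly.
    """
    snr = start_snr
    track = []
    for _ in range(n_trials):
        correct = play_and_score(snr)
        track.append(snr)
        snr += -step_db if correct else step_db      # harder after a correct trial
    return float(np.mean(track[-10:]))               # SRT estimate from late trials
```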
Affiliation(s)
- Sara Magits
- Experimental ORL, Department of Neurosciences, KU Leuven, Leuven, Belgium
- Tom Francart
- Experimental ORL, Department of Neurosciences, KU Leuven, Leuven, Belgium
- Jan Wouters
- Experimental ORL, Department of Neurosciences, KU Leuven, Leuven, Belgium
14.
Abstract
Listening effort is a valuable and important notion to measure because it is among the primary complaints of people with hearing loss. It is tempting and intuitive to accept speech intelligibility scores as a proxy for listening effort, but this link is likely oversimplified and lacks actionable explanatory power. This study was conducted to explain the mechanisms of listening effort that are not captured by intelligibility scores, using sentence-repetition tasks where specific kinds of mistakes were prospectively planned or analyzed retrospectively. Effort was measured as changes in pupil size among 20 listeners with normal hearing and 19 listeners with cochlear implants. Experiment 1 demonstrates that mental correction of misperceived words increases effort even when responses are correct. Experiment 2 shows that for incorrect responses, listening effort is not a function of the proportion of words correct but is rather driven by the types of errors, position of errors within a sentence, and the need to resolve ambiguity, reflecting how easily the listener can make sense of a perception. A simple taxonomy of error types is provided that is both intuitive and consistent with data from these two experiments. The diversity of errors in these experiments implies that speech perception tasks can be designed prospectively to elicit the mistakes that are more closely linked with effort. Although mental corrective action and number of mistakes can scale together in many experiments, it is possible to dissociate them to advance toward a more explanatory (rather than correlational) account of listening effort.
Affiliation(s)
- Matthew B. Winn
- University of Minnesota, Twin Cities, 164 Pillsbury Dr SE, Minneapolis, MN 55455, United States
15. Individual Variability in Recalibrating to Spectrally Shifted Speech: Implications for Cochlear Implants. Ear Hear 2021; 42:1412-1427. [PMID: 33795617] [DOI: 10.1097/aud.0000000000001043]
Abstract
OBJECTIVES Cochlear implant (CI) recipients are at a severe disadvantage compared with normal-hearing listeners in distinguishing consonants that differ by place of articulation because the key relevant spectral differences are degraded by the implant. One component of that degradation is the upward shifting of spectral energy that occurs with a shallow insertion depth of a CI. The present study aimed to systematically measure the effects of spectral shifting on word recognition and phoneme categorization by specifically controlling the amount of shifting and using stimuli whose identification specifically depends on perceiving frequency cues. We hypothesized that listeners would be biased toward perceiving phonemes that contain higher-frequency components because of the upward frequency shift and that intelligibility would decrease as spectral shifting increased. DESIGN Normal-hearing listeners (n = 15) heard sine wave-vocoded speech with simulated upward frequency shifts of 0, 2, 4, and 6 mm of cochlear space to simulate shallow CI insertion depth. Stimuli included monosyllabic words and /b/-/d/ and /∫/-/s/ continua that varied systematically by formant frequency transitions or frication noise spectral peaks, respectively. Recalibration to spectral shifting was operationally defined as shifting perceptual acoustic-phonetic mapping commensurate with the spectral shift; in other words, adjusting frequency expectations for both phonemes upward so that there is still a perceptual distinction, rather than hearing all upward-shifted phonemes as the higher-frequency member of the pair. RESULTS For moderate amounts of spectral shifting, group data suggested a general "halfway" recalibration to spectral shifting, but individual data suggested a notably different conclusion: half of the listeners were able to recalibrate fully, while the other half were utterly unable to categorize shifted speech with any reliability. There were no participants who demonstrated a pattern intermediate to these two extremes. Intelligibility of words decreased with greater amounts of spectral shifting, also showing loose clusters of better- and poorer-performing listeners. Phonetic analysis of word errors revealed that certain cues were more susceptible to being compromised by a frequency shift (place and manner of articulation), while voicing was robust to spectral shifting. CONCLUSIONS Shifting the frequency spectrum of speech has systematic effects that are in line with known properties of speech acoustics, but the ensuing difficulties cannot be predicted based on tonotopic mismatch alone. Difficulties are subject to substantial individual differences in the capacity to adjust acoustic-phonetic mapping. These results help to explain why speech recognition in CI listeners cannot be fully predicted by peripheral factors like electrode placement and spectral resolution; even among listeners with functionally equivalent auditory input, there is an additional factor of simply being able or unable to flexibly adjust acoustic-phonetic mapping. This individual variability could motivate precise treatment approaches guided by an individual's relative reliance on wideband frequency representation (even if it is mismatched) or limited frequency coverage whose tonotopy is preserved.
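Shifts stated in millimeters of cochlear space are conventionally converted to frequency with the Greenwood place-frequency map. The sketch below applies that map to compute the upward-shifted frequency for a given basal shift; this illustrates the simulation logic generally, not the authors' exact stimulus code.

```python
import numpy as np

# Greenwood (1990) human map, f(x) = A * (10**(a*x) - k), with x in mm from the apex.
A, a, k = 165.4, 0.06, 0.88

def greenwood_freq(x_mm):
    """Characteristic frequency (Hz) at a place x_mm from the cochlear apex."""
    return A * (10 ** (a * x_mm) - k)

def greenwood_place(f_hz):
    """Place (mm from apex) whose characteristic frequency is f_hz."""
    return np.log10(f_hz / A + k) / a

def shift_frequency(f_hz, shift_mm):
    """Frequency reaching a place shift_mm more basal than its natural place."""
    return greenwood_freq(greenwood_place(f_hz) + shift_mm)

# Example: a 6 mm basal shift moves a 1000 Hz component to roughly 2.5 kHz.
```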
16.
Abstract
Sequences of phonologically similar words are more difficult to remember than phonologically distinct sequences. This study investigated whether this difficulty arises in the acoustic similarity of auditory stimuli or in the corresponding phonological labels in memory. Participants reconstructed sequences of words which were degraded with a vocoder. We manipulated the phonological similarity of response options across two groups. One group was trained to map stimulus words onto phonologically similar response labels which matched the recorded word; the other group was trained to map words onto a set of plausible responses which were mismatched from the original recordings but were selected to have less phonological overlap. Participants trained on the matched responses were able to learn responses with less training and recall sequences more accurately than participants trained on the mismatched responses, even though the mismatched responses were more phonologically distinct from one another and participants were unaware of the mismatch. The relative difficulty of recalling items in the correct position was the same across both sets of response labels. Mismatched responses impaired recall accuracy across all positions except the final item in each list. These results are consistent with the idea that increased difficulty of mapping acoustic stimuli onto phonological forms impairs serial recall. Increased mapping difficulty could impair retention of memoranda and impede consolidation into phonological forms, which would impair recall in adverse listening conditions.
Affiliation(s)
- Adam K Bosen
- Hearing and Speech Perception, Boys Town National Research Hospital, Omaha, NE, USA
- Elizabeth Monzingo
- Hearing and Speech Perception, Boys Town National Research Hospital, Omaha, NE, USA
- Angela M AuBuchon
- Hearing and Speech Perception, Boys Town National Research Hospital, Omaha, NE, USA
17. Winn MB, Moore AN. Perceptual weighting of acoustic cues for accommodating gender-related talker differences heard by listeners with normal hearing and with cochlear implants. J Acoust Soc Am 2020; 148:496. [PMID: 32873011] [PMCID: PMC7402726] [DOI: 10.1121/10.0001672]
Abstract
Listeners must accommodate acoustic differences between vocal tracts and speaking styles of conversation partners-a process called normalization or accommodation. This study explores what acoustic cues are used to make this perceptual adjustment by listeners with normal hearing or with cochlear implants, when the acoustic variability is related to the talker's gender. A continuum between /ʃ/ and /s/ was paired with naturally spoken vocalic contexts that were parametrically manipulated to vary by numerous cues for talker gender including fundamental frequency (F0), vocal tract length (formant spacing), and direct spectral contrast with the fricative. The goal was to examine relative contributions of these cues toward the tendency to have a lower-frequency acoustic boundary for fricatives spoken by men (found in numerous previous studies). Normal hearing listeners relied primarily on formant spacing and much less on F0. The CI listeners were individually variable, with the F0 cue emerging as the strongest cue on average.
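Cue weights in studies like this one are commonly estimated as coefficients of a logistic regression predicting the categorical response from standardized cue values. A generic sketch follows, with variable names chosen for illustration rather than taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def cue_weights(step, f0, formant_spacing, said_s):
    """Relative perceptual weights of talker cues estimated from trial data.

    Inputs are per-trial arrays: fricative continuum step, context F0, and
    formant spacing (a vocal-tract-length cue), plus a 0/1 's' response.
    Predictors are z-scored so coefficient magnitudes are comparable.
    """
    X = np.column_stack([step, f0, formant_spacing]).astype(float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    model = LogisticRegression().fit(X, np.asarray(said_s))
    return dict(zip(["continuum", "F0", "formant_spacing"], model.coef_[0]))
```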
Affiliation(s)
- Matthew B Winn
- Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Ashley N Moore
- Department of Speech & Hearing Sciences, University of Washington, Seattle, Washington 98105, USA
18. DiNino M, Arenberg JG, Duchen ALR, Winn MB. Effects of Age and Cochlear Implantation on Spectrally Cued Speech Categorization. J Speech Lang Hear Res 2020; 63:2425-2440. [PMID: 32552327] [PMCID: PMC7838840] [DOI: 10.1044/2020_jslhr-19-00127]
Abstract
Purpose Weighting of acoustic cues for perceiving place-of-articulation speech contrasts was measured to determine the separate and interactive effects of age and use of cochlear implants (CIs). It has been found that adults with normal hearing (NH) show reliance on fine-grained spectral information (e.g., formants), whereas adults with CIs show reliance on broad spectral shape (e.g., spectral tilt). In question was whether children with NH and CIs would demonstrate the same patterns as adults, or show differences based on ongoing maturation of hearing and phonetic skills. Method Children and adults with NH and with CIs categorized a /b/-/d/ speech contrast based on two orthogonal spectral cues. Among CI users, phonetic cue weights were compared to vowel identification scores and Spectral-Temporally Modulated Ripple Test thresholds. Results NH children and adults both relied relatively more on the fine-grained formant cue and less on the broad spectral tilt cue compared to participants with CIs. However, early-implanted children with CIs better utilized the formant cue compared to adult CI users. Formant cue weights correlated with CI participants' vowel recognition and in children, also related to Spectral-Temporally Modulated Ripple Test thresholds. Adults and child CI users with very poor phonetic perception showed additive use of the two cues, whereas those with better and/or more mature cue usage showed a prioritized trading relationship, akin to NH listeners. Conclusions Age group and hearing modality can influence phonetic cue-weighting patterns. Results suggest that simple nonlexical categorization tests correlate with more general speech recognition skills of children and adults with CIs.
Affiliation(s)
- Mishaela DiNino
- Department of Psychology, Carnegie Mellon University, Pittsburgh, PA
- Julie G. Arenberg
- Massachusetts Eye and Ear, Harvard Medical School Department of Otolaryngology, Boston
- Matthew B. Winn
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis
19. Gianakas SP, Winn MB. Lexical bias in word recognition by cochlear implant listeners. J Acoust Soc Am 2019; 146:3373. [PMID: 31795696] [PMCID: PMC6948217] [DOI: 10.1121/1.5132938]
Abstract
When hearing an ambiguous speech sound, listeners show a tendency to perceive it as a phoneme that would complete a real word, rather than completing a nonsense/fake word. For example, a sound that could be heard as either /b/ or /ɡ/ is perceived as /b/ when followed by "_ack" but perceived as /ɡ/ when followed by "_ap." Because the target sound is acoustically identical across both environments, this effect demonstrates the influence of top-down lexical processing in speech perception. Degradations in the auditory signal were hypothesized to render speech stimuli more ambiguous, and therefore promote increased lexical bias. Stimuli included three speech continua that varied by spectral cues of varying speeds, including stop formant transitions (fast), fricative spectra (medium), and vowel formants (slow). Stimuli were presented to listeners with cochlear implants (CIs), and also to listeners with normal hearing with clear spectral quality, or with varying amounts of spectral degradation using a noise vocoder. Results indicated an increased lexical bias effect with degraded speech and for CI listeners, for whom the effect size was related to segment duration. This method can probe an individual's reliance on top-down processing even at the level of simple lexical/phonetic perception.
Affiliation(s)
- Steven P Gianakas
- Department of Speech-Language-Hearing Sciences, University of Minnesota, 164 Pillsbury Drive SE, Minneapolis, Minnesota 55455, USA
- Matthew B Winn
- Department of Speech-Language-Hearing Sciences, University of Minnesota, 164 Pillsbury Drive SE, Minneapolis, Minnesota 55455, USA
20. Rødvik AK, Tvete O, Torkildsen JVK, Wie OB, Skaug I, Silvola JT. Consonant and Vowel Confusions in Well-Performing Children and Adolescents With Cochlear Implants, Measured by a Nonsense Syllable Repetition Test. Front Psychol 2019; 10:1813. [PMID: 31474900] [PMCID: PMC6702790] [DOI: 10.3389/fpsyg.2019.01813]
Abstract
Although the majority of early implanted, profoundly deaf children with cochlear implants (CIs), will develop correct pronunciation if they receive adequate oral language stimulation, many of them have difficulties with perceiving minute details of speech. The main aim of this study is to measure the confusion of consonants and vowels in well-performing children and adolescents with CIs. The study also aims to investigate how age at onset of severe to profound deafness influences perception. The participants are 36 children and adolescents with CIs (18 girls), with a mean (SD) age of 11.6 (3.0) years (range: 5.9-16.0 years). Twenty-nine of them are prelingually deaf and seven are postlingually deaf. Two reference groups of normal-hearing (NH) 6- and 13-year-olds are included. Consonant and vowel perception is measured by repetition of 16 bisyllabic vowel-consonant-vowel nonsense words and nine monosyllabic consonant-vowel-consonant nonsense words in an open-set design. For the participants with CIs, consonants were mostly confused with consonants with the same voicing and manner, and the mean (SD) voiced consonant repetition score, 63.9 (10.6)%, was considerably lower than the mean (SD) unvoiced consonant score, 76.9 (9.3)%. There was a devoicing bias for the stops; unvoiced stops were confused with other unvoiced stops and not with voiced stops, and voiced stops were confused with both unvoiced stops and other voiced stops. The mean (SD) vowel repetition score was 85.2 (10.6)% and there was a bias in the confusions of [i:] and [y:]; [y:] was perceived as [i:] twice as often as [y:] was repeated correctly. Subgroup analyses showed no statistically significant differences between the consonant scores for pre- and postlingually deaf participants. For the NH participants, the consonant repetition scores were substantially higher and the difference between voiced and unvoiced consonant repetition scores considerably lower than for the participants with CIs. The participants with CIs obtained scores close to ceiling on vowels and real-word monosyllables, but their perception was substantially lower for voiced consonants. This may partly be related to limitations in the CI technology for the transmission of low-frequency sounds, such as insertion depth of the electrode and ability to convey temporal information.
Affiliation(s)
- Arne Kirkhorn Rødvik
- Department of Special Needs Education, Institute of Educational Sciences, University of Oslo, Oslo, Norway; Cochlear Implant Unit, Department of Otorhinolaryngology, Division of Surgery and Clinical Neuroscience, Oslo University Hospital, Oslo, Norway
- Ole Tvete
- Cochlear Implant Unit, Department of Otorhinolaryngology, Division of Surgery and Clinical Neuroscience, Oslo University Hospital, Oslo, Norway
- Janne von Koss Torkildsen
- Department of Special Needs Education, Institute of Educational Sciences, University of Oslo, Oslo, Norway
- Ona Bø Wie
- Department of Special Needs Education, Institute of Educational Sciences, University of Oslo, Oslo, Norway; Cochlear Implant Unit, Department of Otorhinolaryngology, Division of Surgery and Clinical Neuroscience, Oslo University Hospital, Oslo, Norway
- Juha Tapio Silvola
- Department of Special Needs Education, Institute of Educational Sciences, University of Oslo, Oslo, Norway; Cochlear Implant Unit, Department of Otorhinolaryngology, Division of Surgery and Clinical Neuroscience, Oslo University Hospital, Oslo, Norway; Ear, Nose, and Throat Department, Division of Surgery, Akershus University Hospital, Lørenskog, Norway
21. Anis FN, Umat C, Ahmad K, Hamid BA. Patterns of recognition of Arabic consonants by non-native children with cochlear implants and normal hearing. Cochlear Implants Int 2018; 20:12-22. [PMID: 30293522] [DOI: 10.1080/14670100.2018.1530420]
Abstract
OBJECTIVE This study examined the patterns of recognition of Arabic consonants, via information transmission analysis for phonological features, in a group of Malay children with normal hearing (NH) and cochlear implants (CI). METHOD A total of 336 and 616 acoustic tokens were collected from six CI and 11 NH Malay children, respectively. The groups were matched for hearing age and duration of exposure to Arabic sounds. All 28 Arabic consonants, each in a consonant-vowel /a/ frame, were presented randomly twice via a loudspeaker at approximately 65 dB SPL. The participants were asked to repeat aloud the stimulus heard in each presentation. RESULTS Within the native Malay perceptual space, the two groups responded differently to the Arabic consonants. The dispersed uncategorized assimilation in the CI group was distinct in the confusion matrix (CM), as compared to the NH children. Consonants /ħ/, /tˁ/, /sˁ/ and /ʁ/ were difficult for the CI children, while the most accurate item was /k/ (84%). The CI group transmitted significantly less information than the NH group, especially for the place feature (p < 0.001). Significant interactions between place-hearing status and manner-hearing status were also obtained, suggesting differences in information transmission patterns for consonant recognition between the study groups. CONCLUSION CI and NH Malay children may be using different acoustic cues to recognize Arabic sounds, which contributes to the different patterns of assimilation categories within the Malay perceptual space.
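The feature information transmission analysis referred to here is the classic Miller and Nicely (1955) measure computed from a consonant confusion matrix. A minimal sketch is shown below; how consonants are grouped into feature categories is left to the analyst.

```python
import numpy as np

def feature_information_transfer(confusions, feature_of):
    """Relative information transfer for one phonological feature.

    confusions : square count matrix (rows = stimulus, columns = response)
    feature_of : maps a consonant index to its feature value (e.g., place category)
    """
    n = confusions.shape[0]
    labels = [feature_of(i) for i in range(n)]
    values = sorted(set(labels))
    fmat = np.zeros((len(values), len(values)))
    for i in range(n):                        # collapse consonants into feature classes
        for j in range(n):
            fmat[values.index(labels[i]), values.index(labels[j])] += confusions[i, j]
    p = fmat / fmat.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    nz = p > 0
    mutual = np.sum(p[nz] * np.log2(p[nz] / (px[:, None] * py[None, :])[nz]))
    h_x = -np.sum(px[px > 0] * np.log2(px[px > 0]))
    return mutual / h_x                       # proportion of feature information transmitted
```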
Affiliation(s)
- Farheen Naz Anis
- Centre For Rehabilitation and Special Needs, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Jalan Raja Muda Abdul Aziz 50300, Kuala Lumpur, Malaysia
- Cila Umat
- Centre For Rehabilitation and Special Needs, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Jalan Raja Muda Abdul Aziz 50300, Kuala Lumpur, Malaysia; Institute of Ear, Hearing & Speech, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
- Kartini Ahmad
- Centre For Rehabilitation and Special Needs, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Jalan Raja Muda Abdul Aziz 50300, Kuala Lumpur, Malaysia
- Badrulzaman Abdul Hamid
- Centre For Rehabilitation and Special Needs, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Jalan Raja Muda Abdul Aziz 50300, Kuala Lumpur, Malaysia
22. Objective Identification of Simulated Cochlear Implant Settings in Normal-Hearing Listeners Via Auditory Cortical Evoked Potentials. Ear Hear 2018; 38:e215-e226. [PMID: 28125444] [DOI: 10.1097/aud.0000000000000403]
Abstract
OBJECTIVES Providing cochlear implant (CI) patients with optimal signal processing settings during mapping sessions is critical for facilitating their speech perception. Here, we aimed to evaluate whether auditory cortical event-related potentials (ERPs) could be used to objectively determine optimal CI parameters. DESIGN While recording neuroelectric potentials, we presented a set of acoustically vocoded consonants (aKa, aSHa, and aNa) to normal-hearing listeners (n = 12) that simulated speech tokens processed through four different combinations of CI stimulation rate and number of spectral maxima. Parameter settings were selected to feature relatively fast/slow stimulation rates and high/low numbers of maxima: 1800 pps/20 maxima, 1800/8, 500/20, and 500/8. RESULTS Speech identification and reaction times did not differ with changes in either the number of maxima or stimulation rate, indicating ceiling behavioral performance. Similarly, we found that conventional univariate analysis (analysis of variance) of N1 and P2 amplitude/latency failed to reveal strong modulations across CI-processed speech conditions. In contrast, multivariate discriminant analysis based on a combination of neural measures was used to create "neural confusion matrices" and identified a unique parameter set (1800/8) that maximally differentiated speech tokens at the neural level. This finding was corroborated by information transfer analysis, which confirmed these settings optimally transmitted information in listeners' neural and perceptual responses. CONCLUSIONS Translated to actual implant patients, our findings suggest that scalp-recorded ERPs might be useful in determining optimal signal processing settings from among a closed set of parameter options and aid in the objective fitting of CI devices.
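A "neural confusion matrix" of the kind described can be built from cross-validated discriminant-analysis predictions on single-trial ERP measures. The sketch below assumes N1/P2 amplitudes and latencies as the feature set; the exact features and classifier settings used in the study may differ.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix

def neural_confusion_matrix(erp_features, token_labels):
    """Row-normalized confusion matrix of speech tokens decoded from ERP features.

    erp_features : trials x features array (e.g., N1/P2 amplitude and latency)
    token_labels : speech token presented on each trial
    """
    lda = LinearDiscriminantAnalysis()
    predicted = cross_val_predict(lda, erp_features, token_labels, cv=5)
    return confusion_matrix(token_labels, predicted, normalize="true")
```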
Collapse
|
23
|
Rødvik AK, von Koss Torkildsen J, Wie OB, Storaker MA, Silvola JT. Consonant and Vowel Identification in Cochlear Implant Users Measured by Nonsense Words: A Systematic Review and Meta-Analysis. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2018; 61:1023-1050. [PMID: 29623340 DOI: 10.1044/2018_jslhr-h-16-0463] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/09/2017] [Accepted: 12/18/2017] [Indexed: 06/08/2023]
Abstract
PURPOSE The purpose of this systematic review and meta-analysis was to establish a baseline of the vowel and consonant identification scores in prelingually and postlingually deaf users of multichannel cochlear implants (CIs) tested with consonant-vowel-consonant and vowel-consonant-vowel nonsense syllables. METHOD Six electronic databases were searched for peer-reviewed articles reporting consonant and vowel identification scores in CI users measured by nonsense words. Relevant studies were independently assessed and screened by 2 reviewers. Consonant and vowel identification scores were presented in forest plots and compared between studies in a meta-analysis. RESULTS Forty-seven articles with 50 studies, including 647 participants, of whom 581 were postlingually deaf and 66 prelingually deaf, met the inclusion criteria of this study. The mean performance on vowel identification tasks for the postlingually deaf CI users was 76.8% (N = 5), which was higher than the mean performance for the prelingually deaf CI users (67.7%; N = 1). The mean performance on consonant identification tasks for the postlingually deaf CI users was higher (58.4%; N = 44) than for the prelingually deaf CI users (46.7%; N = 6). The most common consonant confusions were found between those with the same manner of articulation (/k/ as /t/, /m/ as /n/, and /p/ as /t/). CONCLUSIONS Baseline mean performance on consonant identification tasks was established for both prelingually and postlingually deaf CI users. There were no statistically significant differences between the scores for prelingually and postlingually deaf CI users. The consonants that were incorrectly identified were typically confused with other consonants with the same acoustic properties, namely, voicing, duration, nasality, and silent gaps. A univariate metaregression model, although not statistically significant, indicated that duration of implant use in postlingually deaf adults predicted a substantial portion of their consonant identification ability. As there is no ceiling effect, a nonsense syllable identification test may be a useful addition to the standard test battery in audiology clinics when assessing the speech perception of CI users.
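For readers unfamiliar with how identification scores are combined across studies, a bare-bones inverse-variance pooling of proportion-correct scores looks like the following; the numbers are invented and this is a deliberate simplification of the meta-analytic model actually used (which also handles heterogeneity and metaregression):

```python
import numpy as np

# Hypothetical per-study consonant identification scores (proportion correct) and sample sizes.
scores = np.array([0.55, 0.62, 0.48, 0.70, 0.59])
n = np.array([12, 20, 9, 15, 25])

# Inverse-variance (fixed-effect) pooling: each study is weighted by the
# reciprocal of its sampling variance for a proportion.
var = scores * (1 - scores) / n
w = 1 / var
pooled = np.sum(w * scores) / np.sum(w)
se = np.sqrt(1 / np.sum(w))
print(f"pooled = {pooled:.3f} +/- {1.96 * se:.3f}")
```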
Collapse
Affiliation(s)
- Arne Kirkhorn Rødvik
- Department of Special Needs Education, Faculty of Educational Sciences, University of Oslo, Norway
| | | | - Ona Bø Wie
- Department of Special Needs Education, Faculty of Educational Sciences, University of Oslo, Norway
- Oslo University Hospital, Norway
| | - Marit Aarvaag Storaker
- Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, Norway
- Lillehammer Hospital, Norway
| | - Juha Tapio Silvola
- Oslo University Hospital, Norway
- Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, Norway
- Akershus University Hospital, Lørenskog, Norway
| |
Collapse
|
24
|
Reidy PF, Kristensen K, Winn MB, Litovsky RY, Edwards JR. The Acoustics of Word-Initial Fricatives and Their Effect on Word-Level Intelligibility in Children With Bilateral Cochlear Implants. Ear Hear 2018; 38:42-56. [PMID: 27556521 PMCID: PMC5161607 DOI: 10.1097/aud.0000000000000349] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
OBJECTIVES Previous research has found that relative to their peers with normal hearing (NH), children with cochlear implants (CIs) produce the sibilant fricatives /s/ and /∫/ less accurately and with less subphonemic acoustic contrast. The present study sought to further investigate these differences across groups in two ways. First, subphonemic acoustic properties were investigated in terms of dynamic acoustic features that indexed more than just the contrast between /s/ and /∫/. Second, the authors investigated whether such differences in subphonemic acoustic contrast between sibilant fricatives affected the intelligibility of sibilant-initial single word productions by children with CIs and their peers with NH. DESIGN In experiment 1, productions of /s/ and /∫/ in word-initial prevocalic contexts were elicited from 22 children with bilateral CIs (aged 4 to 7 years) who had at least 2 years of CI experience and from 22 chronological age-matched peers with NH. Acoustic features were measured from 17 points across the fricatives: peak frequency was measured to index the place of articulation contrast; spectral variance and amplitude drop were measured to index the degree of sibilance. These acoustic trajectories were fitted with growth-curve models to analyze time-varying spectral change. In experiment 2, phonemically accurate word productions that were elicited in experiment 1 were embedded within four-talker babble and played to 80 adult listeners with NH. Listeners were asked to repeat the words, and their accuracy rate was used as a measure of the intelligibility of the word productions. Regression analyses were run to test which acoustic properties measured in experiment 1 predicted the intelligibility scores from experiment 2. RESULTS The peak frequency trajectories indicated that the children with CIs produced less acoustic contrast between /s/ and /∫/. Group differences were observed in terms of the dynamic aspects (i.e., the trajectory shapes) of the acoustic properties. In the productions by children with CIs, the peak frequency and the amplitude drop trajectories were shallower, and the spectral variance trajectories were more asymmetric, exhibiting greater increases in variance (i.e., reduced sibilance) near the fricative-vowel boundary. The listeners' responses to the word productions indicated that when produced by children with CIs, /∫/-initial words were significantly more intelligible than /s/-initial words. However, when produced by children with NH, /s/-initial words and /∫/-initial words were equally intelligible. Intelligibility was partially predicted from the acoustic properties (Cox & Snell pseudo-R² > 0.190), and the significant predictors were predominantly dynamic, rather than static, ones. CONCLUSIONS Productions from children with CIs differed from those produced by age-matched NH controls in terms of their subphonemic acoustic properties. The intelligibility of sibilant-initial single-word productions by children with CIs is sensitive to the place of articulation of the initial consonant (/∫/-initial words were more intelligible than /s/-initial words), but productions by children with NH were equally intelligible across both places of articulation. Therefore, children with CIs still exhibit differential production abilities for sibilant fricatives at an age when their NH peers do not.
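The trajectory measures described above (peak frequency sampled at several points across the fricative, then fitted with polynomial shape terms) can be sketched as follows. This is a hedged illustration, not the study's pipeline: window lengths, the Welch spectrum, and the white-noise placeholder signal are all assumptions.

```python
import numpy as np
from scipy.signal import welch

def peak_frequency_trajectory(x, fs, n_points=17):
    """Peak spectral frequency at evenly spaced analysis points across a fricative,
    loosely following a windowed-measurement approach (window length assumed)."""
    centres = np.linspace(0, len(x), n_points + 2, dtype=int)[1:-1]
    half = int(0.010 * fs)            # 20-ms analysis windows (assumption)
    peaks = []
    for c in centres:
        seg = x[max(c - half, 0):c + half]
        f, pxx = welch(seg, fs=fs, nperseg=min(256, len(seg)))
        peaks.append(f[np.argmax(pxx)])
    return np.array(peaks)

# Hypothetical fricative-like noise, 150 ms at 44.1 kHz.
fs = 44100
rng = np.random.default_rng(1)
x = rng.normal(size=int(0.15 * fs))
traj = peak_frequency_trajectory(x, fs)

# A quadratic fit over normalised time is a simple stand-in for growth-curve shape terms.
t = np.linspace(-1, 1, len(traj))
print(np.polyfit(t, traj, deg=2))
```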
Collapse
Affiliation(s)
- Patrick F. Reidy
- Waisman Center, University of Wisconsin—Madison, Madison, Wisconsin, USA
- Department of Communication Sciences and Disorders, University of Wisconsin—Madison, Madison, Wisconsin, USA
| | - Kayla Kristensen
- Waisman Center, University of Wisconsin—Madison, Madison, Wisconsin, USA
- Department of Communication Sciences and Disorders, University of Wisconsin—Madison, Madison, Wisconsin, USA
| | - Matthew B. Winn
- Waisman Center, University of Wisconsin—Madison, Madison, Wisconsin, USA
- Department of Surgery, University of Wisconsin—Madison, Madison, Wisconsin, USA
- Department of Speech & Hearing Sciences, University of Washington, Seattle, Washington, USA
| | - Ruth Y. Litovsky
- Waisman Center, University of Wisconsin—Madison, Madison, Wisconsin, USA
- Department of Communication Sciences and Disorders, University of Wisconsin—Madison, Madison, Wisconsin, USA
- Department of Surgery, University of Wisconsin—Madison, Madison, Wisconsin, USA
| | - Jan R. Edwards
- Waisman Center, University of Wisconsin—Madison, Madison, Wisconsin, USA
- Department of Communication Sciences and Disorders, University of Wisconsin—Madison, Madison, Wisconsin, USA
| |
Collapse
|
25
|
Grieco-Calub TM, Simeon KM, Snyder HE, Lew-Williams C. Word segmentation from noise-band vocoded speech. LANGUAGE, COGNITION AND NEUROSCIENCE 2017; 32:1344-1356. [PMID: 29977950 PMCID: PMC6028043 DOI: 10.1080/23273798.2017.1354129] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/08/2016] [Accepted: 07/02/2017] [Indexed: 06/01/2023]
Abstract
Spectral degradation reduces access to the acoustics of spoken language and compromises how learners break into its structure. We hypothesised that spectral degradation disrupts word segmentation, but that listeners can exploit other cues to restore detection of words. Normal-hearing adults were familiarised to artificial speech that was unprocessed or spectrally degraded by noise-band vocoding into 16 or 8 spectral channels. The monotonic speech stream was pause-free (Experiment 1), interspersed with isolated words (Experiment 2), or slowed by 33% (Experiment 3). Participants were tested on segmentation of familiar vs. novel syllable sequences and on recognition of individual syllables. As expected, vocoding hindered both word segmentation and syllable recognition. The addition of isolated words, but not slowed speech, improved segmentation. We conclude that syllable recognition is necessary but not sufficient for successful word segmentation, and that isolated words can facilitate listeners' access to the structure of acoustically degraded speech.
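Noise-band vocoding of the kind used here can be sketched in a few lines: split the signal into frequency bands, extract each band's amplitude envelope, and use it to modulate band-limited noise. A rough Python/SciPy version, with filter orders, band edges, and envelope smoothing chosen as assumptions rather than taken from the study:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_channels=8, lo=100.0, hi=7000.0):
    """Minimal noise-band vocoder: log-spaced analysis bands, Hilbert envelopes,
    and band-limited noise carriers. All filter settings are assumptions."""
    edges = np.geomspace(lo, hi, n_channels + 1)
    rng = np.random.default_rng(0)
    out = np.zeros_like(x, dtype=float)
    for k in range(n_channels):
        sos = butter(4, [edges[k], edges[k + 1]], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        env = np.abs(hilbert(band))                        # amplitude envelope
        sos_env = butter(2, 50.0, btype="lowpass", fs=fs, output="sos")
        env = sosfiltfilt(sos_env, env)                    # smooth the envelope
        carrier = sosfiltfilt(sos, rng.normal(size=len(x)))  # band-limited noise carrier
        out += env * carrier
    return out / (np.max(np.abs(out)) + 1e-12)

fs = 16000
t = np.arange(int(0.5 * fs)) / fs
x = np.sin(2 * np.pi * 440 * t)            # placeholder for a speech signal
y8 = noise_vocode(x, fs, n_channels=8)     # 8-channel condition
y16 = noise_vocode(x, fs, n_channels=16)   # 16-channel condition
```

Reducing `n_channels` coarsens the spectral detail that survives processing, which is the manipulation compared across the 16- and 8-channel conditions.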
Collapse
Affiliation(s)
- Tina M. Grieco-Calub
- The Roxelyn & Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL, USA
| | - Katherine M. Simeon
- The Roxelyn & Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL, USA
| | - Hillary E. Snyder
- The Roxelyn & Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL, USA
| | | |
Collapse
|
26
|
Abstract
BACKGROUND Computer-based auditory training programmes seem to be a useful tool in the process of auditory rehabilitation after cochlear implantation (CI). Currently, little is known about the learning mechanism and efficiency of such programmes. The aim of the study was to evaluate a specific auditory training programme for phoneme discrimination in experienced CI listeners. MATERIALS AND METHODS A total of 15 adult CI listeners with more than 2 years' CI experience participated in the auditory training. Over a period of 3 weeks they were instructed to train their phoneme discrimination via computer twice a week. Training material consisted of syllables designed for consonant (vCv) and vowel (cVc) discrimination. RESULTS Discrimination abilities for consonants and vowels improved significantly over the training period, with larger gains for consonants. In addition, the improvement for voiced and unvoiced consonants was significant. CONCLUSION Computerised auditory training with phonemes improves CI listeners' discrimination abilities for consonants and vowels.
Collapse
|
27
|
Jaekel BN, Newman RS, Goupell MJ. Speech Rate Normalization and Phonemic Boundary Perception in Cochlear-Implant Users. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2017; 60:1398-1416. [PMID: 28395319 PMCID: PMC5580678 DOI: 10.1044/2016_jslhr-h-15-0427] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/11/2015] [Revised: 05/04/2016] [Accepted: 10/14/2016] [Indexed: 05/29/2023]
Abstract
PURPOSE Normal-hearing (NH) listeners rate normalize, temporarily remapping phonemic category boundaries to account for a talker's speech rate. It is unknown if adults who use auditory prostheses called cochlear implants (CI) can rate normalize, as CIs transmit degraded speech signals to the auditory nerve. Ineffective adjustment to rate information could explain some of the variability in this population's speech perception outcomes. METHOD Phonemes with manipulated voice-onset-time (VOT) durations were embedded in sentences with different speech rates. Twenty-three CI and 29 NH participants performed a phoneme identification task. NH participants heard the same unprocessed stimuli as the CI participants or stimuli degraded by a sine vocoder, simulating aspects of CI processing. RESULTS CI participants showed larger rate normalization effects (6.6 ms) than the NH participants (3.7 ms) and had shallower (less reliable) category boundary slopes. NH participants showed similarly shallow slopes when presented acoustically degraded vocoded signals, but an equal or smaller rate effect in response to reductions in available spectral and temporal information. CONCLUSION CI participants can rate normalize, despite their degraded speech input, and show a larger rate effect compared to NH participants. CI participants may particularly rely on rate normalization to better maintain perceptual constancy of the speech signal.
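The boundary and slope measures reported here typically come from fitting a logistic psychometric function to identification responses along the VOT continuum; rate normalization then shows up as a shift in the fitted boundary between fast- and slow-rate contexts. A hedged sketch with invented identification proportions, assuming SciPy's curve_fit:

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(vot, boundary, slope):
    """Probability of a long-VOT (e.g., /t/) response as a function of VOT (ms)."""
    return 1.0 / (1.0 + np.exp(-slope * (vot - boundary)))

# Hypothetical identification proportions across a VOT continuum in one rate condition.
vot = np.array([0, 10, 20, 30, 40, 50, 60], dtype=float)
p_t = np.array([0.02, 0.05, 0.20, 0.55, 0.85, 0.97, 0.99])

(boundary, slope), _ = curve_fit(logistic, vot, p_t, p0=[30.0, 0.2])
print(f"category boundary ~ {boundary:.1f} ms, slope ~ {slope:.3f}")
# Refitting per rate condition and differencing the boundaries gives a
# rate-normalization effect in ms; shallower slopes indicate less reliable boundaries.
```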
Collapse
Affiliation(s)
- Brittany N. Jaekel
- Department of Hearing and Speech Sciences, University of Maryland, College Park
| | - Rochelle S. Newman
- Department of Hearing and Speech Sciences, University of Maryland, College Park
| | - Matthew J. Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park
| |
Collapse
|
28
|
Yang J, Vadlamudi J, Yin Z, Lee CY, Xu L. Production of word-initial fricatives of Mandarin Chinese in prelingually deafened children with cochlear implants. INTERNATIONAL JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2017; 19:153-164. [PMID: 27063694 DOI: 10.3109/17549507.2016.1143972] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/25/2015] [Accepted: 01/15/2016] [Indexed: 06/05/2023]
Abstract
PURPOSE This study examined the production of fricatives by prelingually deafened Mandarin-speaking children with cochlear implants (CIs). METHOD Fourteen cochlear implant (CI) children (2.9-8.3 years old) and 60 age-matched normal-hearing (NH) children were recorded producing a list of 13 Mandarin words with four fricatives, /f, s, ɕ, ʂ/, occurring in syllable-initial position, elicited with a picture-naming task. Two phonetically-trained native Mandarin speakers transcribed the fricative productions. Acoustic analysis was conducted to examine acoustic measures including duration, normalised amplitude, spectral peak location and four spectral moments. RESULTS The CI children showed much lower accuracy rates and more diverse error patterns on all four fricatives than their NH peers. Among these four fricatives, both CI and NH children showed the highest rate of mispronunciation of /s/. The acoustic results showed that the speech of the CI children differed from the NH children in spectral peak location, normalised amplitude, spectral mean and spectral skewness. In addition, the fricatives produced by the CI children showed less distinctive patterns of acoustic measures relative to the NH children. CONCLUSION In general, these results indicate that the CI children have not established distinct categories for the Mandarin fricatives in terms of the place of articulation.
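The four spectral moments referred to above are the mean, variance, skewness, and kurtosis of the spectrum treated as a probability distribution over frequency. A compact sketch, assuming SciPy's periodogram and a white-noise stand-in for an actual fricative recording:

```python
import numpy as np
from scipy.signal import periodogram

def spectral_moments(x, fs):
    """First four spectral moments (centre of gravity, variance, skewness,
    excess kurtosis) of the power spectrum of a signal segment."""
    f, pxx = periodogram(x, fs=fs)
    p = pxx / pxx.sum()                       # normalise spectrum to a distribution
    mean = np.sum(f * p)
    var = np.sum(((f - mean) ** 2) * p)
    sd = np.sqrt(var)
    skew = np.sum(((f - mean) ** 3) * p) / sd ** 3
    kurt = np.sum(((f - mean) ** 4) * p) / sd ** 4 - 3
    return mean, var, skew, kurt

fs = 22050
rng = np.random.default_rng(2)
print(spectral_moments(rng.normal(size=fs // 2), fs))  # white-noise placeholder segment
```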
Collapse
Affiliation(s)
- Jing Yang
- Communication Sciences and Disorders, Speech Language and Hearing Center, University of Central Arkansas, Conway, AR, USA
| | - Jessica Vadlamudi
- Communication Sciences and Disorders, Ohio University, Athens, OH, USA
| | - Zhigang Yin
- Institute of Linguistics, Chinese Academy of Social Sciences, Beijing, PR China
| | - Chao-Yang Lee
- Communication Sciences and Disorders, Ohio University, Athens, OH, USA
| | - Li Xu
- Communication Sciences and Disorders, Ohio University, Athens, OH, USA
| |
Collapse
|
29
|
Abstract
OBJECTIVES This study measured the impact of auditory spectral resolution on listening effort. Systematic degradation in spectral resolution was hypothesized to elicit corresponding systematic increases in pupil dilation, consistent with the notion of pupil dilation as a marker of cognitive load. DESIGN Spectral resolution of sentences was varied with two different vocoders: (1) a noise-channel vocoder with a variable number of spectral channels; and (2) a vocoder designed to simulate front-end processing of a cochlear implant, including peak-picking channel selection with variable synthesis filter slopes to simulate spread of neural excitation. Pupil dilation was measured after subject-specific luminance adjustment and trial-specific baseline measures. Mixed-effects growth curve analysis was used to model pupillary responses over time. RESULTS For both types of vocoder, pupil dilation grew with each successive degradation in spectral resolution. Within each condition, pupillary responses were not related to intelligibility scores, and the effect of spectral resolution on pupil dilation persisted even when only analyzing trials in which responses were 100% correct. CONCLUSIONS Intelligibility scores alone were not sufficient to quantify the effort required to understand speech with poor resolution. Degraded spectral resolution results in increased effort required to understand speech, even when intelligibility is at 100%. Pupillary responses were a sensitive and highly granular measurement to reveal changes in listening effort. Pupillary responses might potentially reveal the benefits of aural prostheses that are not captured by speech intelligibility performance alone as well as the disadvantages that are overcome by increased listening effort.
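Growth curve analysis of pupil traces typically means baseline-correcting each trial and regressing the trace on orthogonal polynomial time terms. A stripped-down, fixed-effects-only sketch (the reported analysis used mixed-effects models across subjects and trials; the sampling rate, baseline window, and trace below are invented):

```python
import numpy as np
from numpy.polynomial import legendre

# Hypothetical pupil trace for one trial, sampled at 60 Hz for 3 s.
fs = 60
t = np.arange(0, 3.0, 1 / fs)
rng = np.random.default_rng(3)
pupil = 0.2 * np.sin(np.pi * t / 3.0) + 0.01 * rng.normal(size=len(t))
baseline = pupil[: int(0.5 * fs)].mean()   # trial-specific baseline (first 0.5 s, assumed)
trace = pupil - baseline

# Orthogonal (Legendre) polynomial time terms, as in growth-curve analysis.
t_norm = np.linspace(-1, 1, len(trace))
X = np.column_stack([legendre.legval(t_norm, np.eye(4)[k]) for k in range(4)])
beta, *_ = np.linalg.lstsq(X, trace, rcond=None)
print(beta)   # intercept, linear, quadratic, cubic time terms
```

Condition effects on listening effort would then appear as differences in the intercept and shape terms across spectral-resolution conditions.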
Collapse
|
30
|
Winn MB, Litovsky RY. Using speech sounds to test functional spectral resolution in listeners with cochlear implants. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 137:1430-1442. [PMID: 25786954 PMCID: PMC4368591 DOI: 10.1121/1.4908308] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/07/2014] [Revised: 11/26/2014] [Accepted: 01/20/2015] [Indexed: 05/31/2023]
Abstract
In this study, spectral properties of speech sounds were used to test functional spectral resolution in people who use cochlear implants (CIs). Specifically, perception of the /ba/-/da/ contrast was tested using two spectral cues: Formant transitions (a fine-resolution cue) and spectral tilt (a coarse-resolution cue). Higher weighting of the formant cues was used as an index of better spectral cue perception. Participants included 19 CI listeners and 10 listeners with normal hearing (NH), for whom spectral resolution was explicitly controlled using a noise vocoder with variable carrier filter widths to simulate electrical current spread. Perceptual weighting of the two cues was modeled with mixed-effects logistic regression, and was found to systematically vary with spectral resolution. The use of formant cues was greatest for NH listeners for unprocessed speech, and declined in the two vocoded conditions. Compared to NH listeners, CI listeners relied less on formant transitions, and more on spectral tilt. Cue-weighting results showed moderately good correspondence with word recognition scores. The current approach to testing functional spectral resolution uses auditory cues that are known to be important for speech categorization, and can thus potentially serve as the basis upon which CI processing strategies and innovations are tested.
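Cue weighting of this sort is commonly estimated by regressing the listener's categorical responses on the two cue dimensions and comparing the fitted coefficients; the study used mixed-effects logistic regression, whereas the sketch below uses a plain logistic regression on simulated single-listener data (cue coding, effect sizes, and scikit-learn usage are assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical trials: each stimulus has a formant-transition cue level and a
# spectral-tilt cue level (both coded -1..+1 toward /da/), plus a binary response.
rng = np.random.default_rng(4)
n = 400
formant = rng.uniform(-1, 1, n)
tilt = rng.uniform(-1, 1, n)
# Simulate a listener who weights formant cues heavily and tilt weakly.
p_da = 1 / (1 + np.exp(-(3.0 * formant + 0.5 * tilt)))
resp_da = rng.random(n) < p_da

X = np.column_stack([formant, tilt])
model = LogisticRegression().fit(X, resp_da)
w_formant, w_tilt = model.coef_[0]
total = abs(w_formant) + abs(w_tilt)
# Normalised coefficients serve as perceptual cue weights; greater reliance on tilt
# relative to formants would suggest poorer functional spectral resolution.
print(w_formant / total, w_tilt / total)
```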
Collapse
Affiliation(s)
- Matthew B Winn
- Waisman Center and Department of Surgery, University of Wisconsin-Madison, 1500 Highland Avenue, Madison, Wisconsin 53705
| | - Ruth Y Litovsky
- Waisman Center, Department of Communication Sciences and Disorders and Department of Surgery, University of Wisconsin-Madison, 1500 Highland Avenue, Madison, Wisconsin 53705
| |
Collapse
|
31
|
Carlyon RP, Monstrey J, Deeks JM, Macherey O. Evaluation of a cochlear-implant processing strategy incorporating phantom stimulation and asymmetric pulses. Int J Audiol 2014; 53:871-9. [PMID: 25358027 PMCID: PMC4266076 DOI: 10.3109/14992027.2014.932024] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
OBJECTIVE To evaluate a speech-processing strategy in which the lowest frequency channel is conveyed using an asymmetric pulse shape and "phantom stimulation", where current is injected into one intra-cochlear electrode and where the return current is shared between an intra-cochlear and an extra-cochlear electrode. This strategy is expected to provide more selective excitation of the cochlear apex, compared to a standard strategy where the lowest-frequency channel is conveyed by symmetric pulses in monopolar mode. In both strategies all other channels were conveyed by monopolar stimulation. DESIGN Within-subjects comparison between the two strategies. Four experiments: (1) discrimination between the strategies, controlling for loudness differences, (2) consonant identification, (3) recognition of lowpass-filtered sentences in quiet, (4) sentence recognition in the presence of a competing speaker. STUDY SAMPLE Eight users of the Advanced Bionics CII/Hi-Res 90k cochlear implant. RESULTS Listeners could easily discriminate between the two strategies but no consistent differences in performance were observed. CONCLUSIONS The proposed method does not improve speech perception, at least in the short term.
Collapse
|
32
|
Warner-Czyz AD, Houston DM, Hynan LS. Vowel discrimination by hearing infants as a function of number of spectral channels. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 135:3017-24. [PMID: 24815281 PMCID: PMC4109213 DOI: 10.1121/1.4870700] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/14/2012] [Revised: 03/26/2014] [Accepted: 03/27/2014] [Indexed: 05/15/2023]
Abstract
Reduced spectral resolution negatively impacts speech perception, particularly perception of vowels and consonant place. This study assessed impact of number of spectral channels on vowel discrimination by 6-month-old infants with normal hearing by comparing three listening conditions: Unprocessed speech, 32 channels, and 16 channels. Auditory stimuli (/ti/ and /ta/) were spectrally reduced using a noiseband vocoder and presented to infants with normal hearing via visual habituation. Results supported a significant effect of number of channels on vowel discrimination by 6-month-old infants. No differences emerged between unprocessed and 32-channel conditions in which infants looked longer during novel stimulus trials (i.e., discrimination). The 16-channel condition yielded a significantly different pattern: Infants demonstrated no significant difference in looking time to familiar vs novel stimulus trials, suggesting infants cannot discriminate /ti/ and /ta/ with only 16 channels. Results support effects of spectral resolution on vowel discrimination. Relative to published reports, young infants need more spectral detail than older children and adults to perceive spectrally degraded speech. Results have implications for development of perception by infants with hearing loss who receive auditory prostheses.
Collapse
Affiliation(s)
- Andrea D Warner-Czyz
- Department of Communication Sciences and Disorders, The University of Texas at Dallas, Callier Advanced Hearing Research Center, 1966 Inwood Road, Dallas, Texas 75235
| | - Derek M Houston
- Department of Otolaryngology, Head and Neck Surgery, Indiana University School of Medicine, 699 Riley Hospital Drive/RR044, Indianapolis, Indiana 46202
| | - Linda S Hynan
- Departments of Clinical Sciences and Psychiatry, The University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, Texas 75390
| |
Collapse
|
33
|
Van Zyl M, Hanekom JJ. Perception of vowels and prosody by cochlear implant recipients in noise. JOURNAL OF COMMUNICATION DISORDERS 2013; 46:449-464. [PMID: 24157128 DOI: 10.1016/j.jcomdis.2013.09.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/02/2012] [Revised: 09/13/2013] [Accepted: 09/16/2013] [Indexed: 06/02/2023]
Abstract
UNLABELLED The aim of the present study was to compare the ability of cochlear implant (CI) recipients to recognise speech prosody in the presence of speech-weighted noise to their ability to recognise vowels in the same test paradigm and listening condition. All test materials were recorded from four different speakers (two male, two female). Two prosody recognition tasks were developed, both using single words as stimuli. The first task involved a question/statement distinction, while the second task required listeners to make a judgement about the speaker's attitude. Vowel recognition tests were conducted using vowel pairs selected on the basis of specific acoustic cues (frequencies of the first two formants and duration). Ten CI users and ten normal-hearing controls were tested in both quiet and an adaptive noise condition, using a two-alternative forced-choice test paradigm for all the tests. Results indicated that vowel recognition was significantly better than prosody recognition in both listener groups in both quiet and noise, and that question/statement discrimination was the most difficult task for CI listeners in noise. Data from acoustic analyses were used to interpret differences in performance on different tasks and with different speakers. LEARNING OUTCOMES As a result of this activity, readers will be able to (1) describe suitable methods for comparing vowel and prosody perception in noise, (2) compare performance on vowel and prosody perception tasks in quiet in normal-hearing listeners and cochlear implant recipients, (3) compare performance on vowel and prosody perception tasks in noise in normal-hearing listeners and cochlear implant recipients and (4) relate performance on prosody tasks in quiet to performance on these tasks in noise.
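The adaptive noise condition mentioned above is usually implemented as a staircase that adjusts the signal-to-noise ratio from trial to trial based on response accuracy. A generic 2-down/1-up sketch is shown below; the step rule, step size, and starting level are assumptions, not the paper's exact procedure:

```python
import numpy as np

def staircase_snr(respond, start_snr=10.0, step=2.0, n_trials=40):
    """Simple 2-down/1-up adaptive track on SNR (dB), converging near 70.7% correct."""
    snr, correct_run, track = start_snr, 0, []
    for _ in range(n_trials):
        correct = respond(snr)
        track.append(snr)
        if correct:
            correct_run += 1
            if correct_run == 2:       # two correct in a row -> harder (lower SNR)
                snr -= step
                correct_run = 0
        else:                          # one error -> easier (higher SNR)
            snr += step
            correct_run = 0
    return np.mean(track[-10:])        # crude threshold estimate from late trials

# Hypothetical listener whose accuracy follows a logistic function of SNR.
rng = np.random.default_rng(5)
listener = lambda snr: rng.random() < 1 / (1 + np.exp(-(snr - 2.0)))
print(staircase_snr(listener))
```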
Collapse
Affiliation(s)
- Marianne Van Zyl
- Department of Electrical, Electronic and Computer Engineering, University of Pretoria, Lynnwood Road, Pretoria 0002, South Africa
| | | |
Collapse
|
34
|
Todd AE, Edwards JR, Litovsky RY. Production of contrast between sibilant fricatives by children with cochlear implants. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2011; 130:3969-3979. [PMID: 22225051 PMCID: PMC3253598 DOI: 10.1121/1.3652852] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/03/2010] [Revised: 09/13/2011] [Accepted: 09/16/2011] [Indexed: 05/28/2023]
Abstract
Speech production by children with cochlear implants (CIs) is generally less intelligible and less accurate on a phonemic level than that of normally hearing children. Research has reported that children with CIs produce less acoustic contrast between phonemes than normally hearing children, but these studies have included correct and incorrect productions. The present study compared the extent of contrast between correct productions of /s/ and /∫/ by children with CIs and two comparison groups: (1) normally hearing children of the same chronological age as the children with CIs and (2) normally hearing children with the same duration of auditory experience. Spectral peaks and means were calculated from the frication noise of productions of /s/ and /∫/. Results showed that the children with CIs produced less contrast between /s/ and /∫/ than normally hearing children of the same chronological age and normally hearing children with the same duration of auditory experience due to production of /s/ with spectral peaks and means at lower frequencies. The results indicate that there may be differences between the speech sounds produced by children with CIs and their normally hearing peers even for sounds that adults judge as correct.
Collapse
Affiliation(s)
- Ann E Todd
- University of Wisconsin Waisman Center, 1500 Highland Avenue, Madison, Wisconsin 53705, USA
| | | | | |
Collapse
|
35
|
Välimaa TT, Sorri MJ, Laitakari J, Sivonen V, Muhli A. Vowel confusion patterns in adults during initial 4 years of implant use. CLINICAL LINGUISTICS & PHONETICS 2011; 25:121-144. [PMID: 21070135 DOI: 10.3109/02699206.2010.514692] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]
Abstract
This study investigated adult cochlear implant users' (n = 39) vowel recognition and confusions by an open-set syllable test during 4 years of implant use, in a prospective repeated-measures design. Subjects' responses were coded for phoneme errors and estimated by the generalized mixed model. Improvement in overall vowel recognition was highest during the first 6 months, showing statistically significant change until 4 years, especially for the mediocre performers. The best performers improved statistically significantly until 18 months. The poorest performers improved until 12 months and exhibited more vowel confusions. No differences were found in overall vowel recognition between Nucleus24M/24R and Med-ElC40+ device users (matched comparison), but certain vowels showed statistically significant differences. Vowel confusions between adjacent vowels were evident, probably due to the implant users' inability to discriminate formant frequencies. Vowel confusions were also dominated by vowels whose average F1 and/or F2 frequencies were higher than the target vowel, indicating a basalward shift in the confusions.
Collapse
Affiliation(s)
- Taina T Välimaa
- Faculty of Humanities, Logopedics, and Department of Otorhinolaryngology, Oulu University Hospital, University of Oulu, Finland.
| | | | | | | | | |
Collapse
|
36
|
Verschuur C. Modeling the effect of channel number and interaction on consonant recognition in a cochlear implant peak-picking strategy. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 125:1723-1736. [PMID: 19275329 DOI: 10.1121/1.3075554] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
Difficulties in speech recognition experienced by cochlear implant users may be attributed both to information loss caused by signal processing and to information loss associated with the interface between the electrode array and auditory nervous system, including cross-channel interaction. The objective of the work reported here was to attempt to partial out the relative contribution of these different factors to consonant recognition. This was achieved by comparing patterns of consonant feature recognition as a function of channel number and presence/absence of background noise in users of the Nucleus 24 device with normal hearing subjects listening to acoustic models that mimicked processing of that device. Additionally, in the acoustic model experiment, a simulation of cross-channel spread of excitation, or "channel interaction," was varied. Results showed that acoustic model experiments were highly correlated with patterns of performance in better-performing cochlear implant users. Deficits to consonant recognition in this subgroup could be attributed to cochlear implant processing, whereas channel interaction played a much smaller role in determining performance errors. The study also showed that large changes to channel number in the Advanced Combination Encoder signal processing strategy led to no substantial changes in performance.
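The peak-picking (n-of-m) principle behind the modelled strategy keeps only the highest-amplitude channel envelopes in each analysis frame and discards the rest. A toy illustration of that channel-selection step, ignoring filterbank details and any simulated channel interaction; the channel and frame counts are arbitrary:

```python
import numpy as np

def peak_pick(envelopes, n_maxima=8):
    """n-of-m channel selection, ACE-style: per analysis frame, keep only the
    n channels with the largest envelope amplitude and zero the rest."""
    env = np.asarray(envelopes)                 # shape: (n_channels, n_frames)
    selected = np.zeros_like(env)
    for frame in range(env.shape[1]):
        top = np.argsort(env[:, frame])[-n_maxima:]
        selected[top, frame] = env[top, frame]
    return selected

# Hypothetical 22-channel envelope matrix over 100 frames.
rng = np.random.default_rng(6)
env = np.abs(rng.normal(size=(22, 100)))
out = peak_pick(env, n_maxima=8)
print((out > 0).sum(axis=0)[:5])   # 8 active channels in each frame
```

Varying `n_maxima` (and the total channel count) is the kind of manipulation whose effect on consonant features the study set out to isolate.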
Collapse
Affiliation(s)
- Carl Verschuur
- Hearing and Balance Centre, Institute of Sound and Vibration Research, University of Southampton, Highfield, Southampton, United Kingdom
| |
Collapse
|
37
|
The psychoacoustics of noise vocoded speech: A physiological means to a perceptual end. Hear Res 2008; 241:87-96. [PMID: 18556159 DOI: 10.1016/j.heares.2008.05.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/04/2007] [Revised: 04/29/2008] [Accepted: 05/06/2008] [Indexed: 10/22/2022]
|
38
|
Rødvik AK. Perception and confusion of speech sounds by adults with a cochlear implant. CLINICAL LINGUISTICS & PHONETICS 2008; 22:371-378. [PMID: 18415737 DOI: 10.1080/02699200801919299] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]
Abstract
The aim of this pilot study was to identify the most common speech sound confusions of 5 Norwegian post-lingually deafened adults with cochlear implants. We played recorded nonwords, aCa, iCi and bVb, to our informants, asked them to repeat what they heard, recorded their repetitions and transcribed these phonetically. We arranged the collected data in confusion matrices to find the most common and most uncommon speech sound confusions. We found that the voiced and unvoiced consonants were seldom confused. We also found that there was a higher rate of consonant confusion for the iCi words than for the aCa words. The most frequent confusions were [ŋ] perceived as [n], [m] perceived as [n] and [ʋ] perceived as [n]. For the consonants, manner of articulation was rarely confused, but place of articulation was often confused. An exception to this was the confusion of [l] and [n], which differ only in manner of articulation. The latter is in accordance with reports we receive from clinicians. We postulate that this is caused by the speech processing of the cochlear implant. We found less confusion of the vowels, which can be explained by the fact that vowels have much higher energy and longer duration than most of the consonants. The most frequent confusions were [a:] perceived as [see text] and [u:] perceived as [see text]. [e:], [i:] and [see text] were never confused with other vowels.
Collapse
Affiliation(s)
- Arne K Rødvik
- Department of Otolaryngology, Rikshospitalet University Hospital, Oslo, Norway.
| |
Collapse
|
39
|
Desai S, Stickney G, Zeng FG. Auditory-visual speech perception in normal-hearing and cochlear-implant listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 123:428-440. [PMID: 18177171 PMCID: PMC2662523 DOI: 10.1121/1.2816573] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/25/2023]
Abstract
The present study evaluated auditory-visual speech perception in cochlear-implant users as well as normal-hearing and simulated-implant controls to delineate relative contributions of sensory experience and cues. Auditory-only, visual-only, or auditory-visual speech perception was examined in the context of categorical perception, in which an animated face mouthing ba, da, or ga was paired with synthesized phonemes from an 11-token auditory continuum. A three-alternative, forced-choice method was used to yield percent identification scores. Normal-hearing listeners showed sharp phoneme boundaries and strong reliance on the auditory cue, whereas actual and simulated implant listeners showed much weaker categorical perception but stronger dependence on the visual cue. The implant users were able to integrate both congruent and incongruent acoustic and optical cues to derive relatively weak but significant auditory-visual integration. This auditory-visual integration was correlated with the duration of the implant experience but not the duration of deafness. Compared with the actual implant performance, acoustic simulations of the cochlear implant could predict the auditory-only performance but not the auditory-visual integration. These results suggest that both altered sensory experience and impoverished acoustic cues contribute to the auditory-visual speech perception in cochlear-implant users.
Collapse
Affiliation(s)
| | | | - Fan-Gang Zeng
- University of California, Irvine, 364 Med Surge II, Irvine, CA 92697, USA
| |
Collapse
|
40
|
Donaldson GS, Kreft HA. Effects of Vowel Context on the Recognition of Initial and Medial Consonants by Cochlear Implant Users. Ear Hear 2006; 27:658-77. [PMID: 17086077 DOI: 10.1097/01.aud.0000240543.31567.54] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVE Scores on consonant-recognition tests are widely used as an index of speech-perception ability in cochlear implant (CI) users. The consonant stimuli in these tests are typically presented in the /ɑ/ vowel context, even though consonants in conversational speech occur in many other contexts. For this reason, it would be useful to know whether vowel context has any systematic effect on consonant recognition in this population. The purpose of the present study was to compare consonant recognition for the /ɑ/, /i/, and /u/ vowel contexts for consonants presented in both initial (Cv) and medial (vCv) positions. DESIGN Twenty adult CI users with one of three different implanted devices underwent consonant-confusion testing. Twelve stimulus conditions that differed according to vowel context (/ɑ/, /i/, /u/), consonant position (Cv, vCv), and talker gender (male, female) were assessed in each subject. RESULTS Mean percent-correct consonant-recognition scores were slightly (5 to 8%) higher for the /ɑ/ and /u/ vowel contexts than for the /i/ vowel context for both initial and medial consonants. This general pattern was observed for both male and female talkers, for subjects with better and poorer average consonant-recognition performance, and for subjects using low, medium, and high stimulation rates in their speech processors. In contrast to the mean data, many individual subjects demonstrated large effects of vowel context. For 10 of 20 subjects, consonant-recognition scores varied by 15% or more across vowel contexts in one or more stimulus conditions. Similar to the mean data, these differences generally reflected better performance for the /ɑ/ and /u/ vowel contexts than for the /i/ vowel context. An analysis of consonant features showed that overall performance was best for the voicing feature, followed by the manner and place features, and that the place feature showed the strongest effect of vowel context. Vowel-context effects were strongest for the six consonants /d/, /j/, /n/, /k/, /m/, and /l/. For three of these consonants (/j, n, k/), the back vowels /ɑ/ and /u/ produced substantially (30 to 35%) higher mean scores than the front vowel /i/. For each of the remaining three consonants, a unique pattern was observed in which a different single vowel produced substantially higher scores than the others. Several additional consonants (/s/, /g/, /w/, /b/, and /d/) showed strong context effects in either the initial consonant or medial consonant position. Overall, voiceless stop, nasal, and glide-liquid consonants showed the strongest effects of vowel context, whereas the voiceless fricative and voiceless affricate consonants were least affected. Consistent with the feature analysis, a qualitative assessment of phoneme errors for the six key consonants indicated that vowel-context effects stem primarily from changes in the number of place-of-articulation errors made in each context. CONCLUSIONS Vowel context has small but significant effects on consonant-recognition scores for the "average" CI listener, with the back vowels /ɑ/ and /u/ producing better performance than the front vowel /i/. In contrast to the average results, however, the effects of vowel context are sizable in some individual subjects. This suggests that it may be beneficial to assess consonant recognition using two vowels, such as /ɑ/ and /i/, which produce better and poorer performance, respectively.
The present results underscore previous findings that poor transmission of spectral speech cues limits consonant-recognition performance in CI users. Spectral cue transmission may be hindered not only by poor spectral resolution in these listeners but also by the brief duration and dynamic nature of consonant place-of-articulation cues.
Collapse
Affiliation(s)
- Gail S Donaldson
- Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA.
| | | |
Collapse
|
41
|
Füllgrabe C, Berthommier F, Lorenzi C. Masking release for consonant features in temporally fluctuating background noise. Hear Res 2005; 211:74-84. [PMID: 16289579 DOI: 10.1016/j.heares.2005.09.001] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/05/2005] [Revised: 09/05/2005] [Accepted: 09/14/2005] [Indexed: 10/25/2022]
Abstract
Consonant identification was measured for normal-hearing listeners using Vowel-Consonant-Vowel stimuli that were either unprocessed or spectrally degraded to force listeners to use temporal-envelope cues. Stimuli were embedded in a steady state or fluctuating noise masker and presented at a fixed signal-to-noise ratio. Fluctuations in the maskers were obtained by applying sinusoidal modulation to: (i) the amplitude of the noise (1st-order SAM masker) or (ii) the modulation depth of a 1st-order SAM noise (2nd-order SAM masker). The frequencies of the amplitude variation fm and the depth variation f'm were systematically varied. Consistent with previous studies, identification scores obtained with unprocessed speech were highest in an 8-Hz, 1st-order SAM masker. Reception of voicing and manner also peaked around fm=8 Hz, while the reception of place of articulation was maximal at a higher frequency (fm=32 Hz). When 2nd-order SAM maskers were used, identification scores and received information for each consonant feature were found to be independent of f'm. They decreased progressively with increasing carrier modulation frequency fm, and ranged between those obtained with the steady state and the 1st-order SAM maskers. Finally, the results obtained with spectrally degraded speech were similar across all types of maskers, although an 8% improvement in the reception of voicing was observed for modulated maskers with fm < 64 Hz compared to the steady-state masker. These data provide additional evidence that listeners take advantage of temporal minima in fluctuating background noises, and suggest that: (i) minima of different durations are required for an optimal reception of the three consonant features and (ii) complex (i.e., 2nd-order) envelope fluctuations in background noise do not degrade speech identification by interfering with speech-envelope processing.
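The 1st- and 2nd-order SAM maskers described above can be generated directly: a sinusoid modulates the noise amplitude, and for the 2nd-order case a second, slower sinusoid modulates the depth of that modulation. A short sketch with invented modulation depths and example frequencies:

```python
import numpy as np

fs = 44100
t = np.arange(int(1.0 * fs)) / fs
rng = np.random.default_rng(7)
noise = rng.normal(size=len(t))

# 1st-order SAM masker: sinusoidal modulation of the noise amplitude at fm (here 8 Hz).
fm, depth = 8.0, 1.0
sam1 = noise * (1 + depth * np.sin(2 * np.pi * fm * t))

# 2nd-order SAM masker: the modulation depth of a 1st-order SAM noise is itself
# varied sinusoidally at f'm (here 2 Hz); the depth values are assumptions.
fm2, m1, m2 = 2.0, 0.5, 0.5
depth_t = m1 * (1 + m2 * np.sin(2 * np.pi * fm2 * t))
sam2 = noise * (1 + depth_t * np.sin(2 * np.pi * fm * t))
```

Sweeping fm (and, for the 2nd-order masker, f'm) reproduces the masker manipulation whose effect on voicing, manner, and place reception the study measured.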
Collapse
Affiliation(s)
- Christian Füllgrabe
- Laboratoire de Psychologie Expérimentale - UMR CNRS 8581, Institut de Psychologie, Université René Descartes - Paris 5, 71 Avenue Vaillant, 92774 Boulogne-Billancourt, France.
| | | | | |
Collapse
|
42
|
Munson B, Nelson PB. Phonetic identification in quiet and in noise by listeners with cochlear implants. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2005; 118:2607-17. [PMID: 16266181 DOI: 10.1121/1.2005887] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/05/2023]
Abstract
This study examined the effect of noise on the identification of four synthetic speech continua (/ra/-/la/, /wa/-/ja/, /i/-/u/, and say-stay) by adults with cochlear implants (CIs) and adults with normal hearing (NH) sensitivity in quiet and noise. Significant group-by-SNR interactions were found for endpoint identification accuracy for all continua except /i/-/u/. The CI listeners showed the least NH-like identification functions for the /ra/-/la/ and /wa/-/ja/ continua. In a second experiment, NH adults identified four- and eight-band cochlear implant simulations of the four continua, to examine whether group differences in frequency selectivity could account for the group differences in the first experiment. Number of bands and SNR interacted significantly for /ra/-/la/, /wa/-/ja/, and say-stay endpoint identification; strongest effects were found for the /ra/-/la/ and say-stay continua. Results suggest that the speech features that are most vulnerable to misperception in noise by listeners with CIs are those whose acoustic cues are rapidly changing spectral patterns, like the formant transitions in the /wa/-/ja/ and /ra/-/la/ continua. However, the group differences in the first experiment cannot be wholly attributable to frequency selectivity differences, as the number of bands in the second experiment affected performance differently than suggested by group differences in the first experiment.
Collapse
Affiliation(s)
- Benjamin Munson
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, Minneapolis, Minnesota 55455, USA.
| | | |
Collapse
|
43
|
Abstract
OBJECTIVE The objective of this study was to measure the performance of persons with cochlear implants on a test of environmental-sound reception. DESIGN The reception of environmental sounds was studied using a test employing closed sets of 10 sounds in each of four different settings (General Home, Kitchen, Office, and Outside). The participants in the study were 11 subjects with cochlear implants. Identification testing was conducted under each of the four closed sets of stimuli using a one-interval, 10-alternative, forced-choice procedure. The data were summarized in terms of overall percent correct identification scores and information transfer (IT) in bits. Confusion patterns were described using a hierarchical-clustering analysis. In addition, individual performance on the environmental-sound task was related to the ability to recognize isolated words through the cochlear implant alone. RESULTS Levels of performance were similar across the four stimulus sets. Mean scores across subjects ranged from 45.3% correct (and IT of 1.5 bits) to 93.8% correct (and IT of 3.1 bits). Performance on the environmental-sound identification test was roughly related to NU-6 word recognition ability. Specifically, those subjects with word scores greater than 34% correct performed at levels of 80 to 94% on environmental-sound recognition, whereas subjects with word scores less than 34% had greater difficulty on the task. Results of the hierarchical clustering analysis, conducted on two groups of subjects (a high-performing [HP] group and a low-performing [LP] group), indicated that confusions were confined to three or four specific stimuli for the HP subjects and that larger clusters of confused stimuli were observed in the data of the LP group. Signals with distinct temporal-envelope characteristics were easily perceived by all subjects, and confused items tended to share similar overall durations and temporal envelopes. CONCLUSIONS Temporal-envelope cues appear to play a large role in the identification of environmental sounds through cochlear implants. The finer distinctions made by the HP group compared with the LP group may be related to a better ability both to resolve temporal differences and to use gross spectral cues. These findings are qualitatively consistent with patterns of confusions observed in the reception of speech segments through cochlear implants.
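Hierarchical clustering of confusion data, as used here to find groups of mutually confused sounds, can be sketched by converting the confusion matrix into a symmetric dissimilarity and feeding it to an agglomerative linkage. A small illustration with an invented 4-sound matrix, assuming SciPy's clustering utilities:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical 4-sound confusion matrix (rows = presented, columns = responded).
cm = np.array([[18, 1, 1, 0],
               [2, 15, 2, 1],
               [1, 3, 14, 2],
               [0, 1, 2, 17]], dtype=float)

# Convert confusions to a symmetric similarity, then to a dissimilarity for clustering.
p = cm / cm.sum(axis=1, keepdims=True)
sim = (p + p.T) / 2
np.fill_diagonal(sim, 1.0)
dist = squareform(1 - sim, checks=False)          # condensed distance vector
tree = linkage(dist, method="average")
print(fcluster(tree, t=2, criterion="maxclust"))  # clusters of mutually confused sounds
```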
Collapse
Affiliation(s)
- Charlotte M Reed
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | | |
Collapse
|
44
|
Collison EA, Munson B, Carney AE. Relations among linguistic and cognitive skills and spoken word recognition in adults with cochlear implants. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2004; 47:496-508. [PMID: 15212564 DOI: 10.1044/1092-4388(2004/039)] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]
Abstract
This study examined spoken word recognition in adults with cochlear implants (CIs) to determine the extent to which linguistic and cognitive abilities predict variability in speech-perception performance. Both a traditional consonant-vowel-consonant (CVC)-repetition measure and a gated-word recognition measure (F. Grosjean, 1996) were used. Stimuli in the gated-word-recognition task varied in neighborhood density. Adults with CIs repeated CVC words less accurately than did age-matched adults with normal hearing sensitivity (NH). In addition, adults with CIs required more acoustic information to recognize gated words than did adults with NH. Neighborhood density had a smaller influence on gated-word recognition by adults with CIs than on recognition by adults with NH. With the exception of 1 outlying participant, standardized, norm-referenced measures of cognitive and linguistic abilities were not correlated with word-recognition measures. Taken together, these results do not support the hypothesis that cognitive and linguistic abilities predict variability in speech-perception performance in a heterogeneous group of adults with CIs. Findings are discussed in light of the potential role of auditory perception in mediating relations among cognitive and linguistic skill and spoken word recognition.
Collapse
|