1
Amini AE, Naples JG, Cortina L, Hwa T, Morcos M, Castellanos I, Moberly AC. A Scoping Review and Meta-Analysis of the Relations Between Cognition and Cochlear Implant Outcomes and the Effect of Quiet Versus Noise Testing Conditions. Ear Hear 2024; 45:1339-1352. [PMID: 38953851 PMCID: PMC11493527 DOI: 10.1097/aud.0000000000001527]
Abstract
OBJECTIVES Evidence continues to emerge of associations between cochlear implant (CI) outcomes and cognitive functions in postlingually deafened adults. While there are multiple factors that appear to affect these associations, the impact of speech recognition background testing conditions (i.e., in quiet versus noise) has not been systematically explored. The two aims of this study were (1) to identify associations between speech recognition following cochlear implantation and performance on cognitive tasks, and (2) to investigate the impact of speech testing in quiet versus noise on these associations. Ultimately, we want to understand the conditions that impact this complex relationship between CI outcomes and cognition. DESIGN A scoping review following Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines was performed on published literature evaluating the relation between outcomes of cochlear implantation and cognition. The current review evaluates 39 papers that reported associations between over 30 cognitive assessments and speech recognition tests in adult patients with CIs. Six cognitive domains were evaluated: Global Cognition, Inhibition-Concentration, Memory and Learning, Controlled Fluency, Verbal Fluency, and Visuospatial Organization. Meta-analysis was conducted on three cognitive assessments among 12 studies to evaluate relations with speech recognition outcomes. Subgroup analyses were performed to identify whether speech recognition testing in quiet versus in background noise impacted its association with cognitive performance. RESULTS Significant associations between cognition and speech recognition in a background of quiet or noise were found in 69% of studies. Tests of Global Cognition and Inhibition-Concentration skills resulted in the highest overall frequency of significant associations with speech recognition (45% and 57%, respectively). Despite the modest proportion of significant associations reported, pooling effect sizes across samples through meta-analysis revealed a moderate positive correlation between postoperative speech recognition skills and tests of Global Cognition (r = +0.37, p < 0.01) as well as Verbal Fluency (r = +0.44, p < 0.01). Tests of Memory and Learning are most frequently utilized in the setting of CI (in 26 of 39 included studies), yet meta-analysis revealed nonsignificant associations with speech recognition performance in a background of quiet (r = +0.30, p = 0.18) and noise (r = -0.06, p = 0.78). CONCLUSIONS Background conditions of speech recognition testing may influence the relation between speech recognition outcomes and cognition. The magnitude of this effect appears to vary depending on the cognitive construct being assessed. Overall, Global Cognition and Inhibition-Concentration skills are potentially useful in explaining speech recognition skills following cochlear implantation. Future work should continue to evaluate these relations to appropriately unify cognitive testing opportunities in the setting of cochlear implantation.
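For readers who want the mechanics behind pooled correlations such as r = +0.37: effect sizes of this kind are conventionally combined via Fisher's r-to-z transform with inverse-variance weights. The sketch below is a minimal fixed-effect illustration with made-up study values; it is not the review's data, and the authors may well have used a random-effects model.

```python
import math

def pool_correlations(results):
    """Pool Pearson r values across studies via Fisher's r-to-z transform.

    `results` is a list of (r, n) pairs. Each z = atanh(r) has variance
    1/(n - 3), so n - 3 serves as the inverse-variance weight. Returns
    the pooled r and a two-sided p value (normal approximation).
    """
    num = den = 0.0
    for r, n in results:
        w = n - 3
        num += w * math.atanh(r)
        den += w
    z_pooled = num / den
    z_stat = z_pooled / math.sqrt(1.0 / den)
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z_stat) / math.sqrt(2.0))))
    return math.tanh(z_pooled), p

# Made-up (r, n) pairs for illustration only
r_pooled, p = pool_correlations([(0.42, 30), (0.31, 45), (0.38, 25)])
print(f"pooled r = {r_pooled:+.2f}, p = {p:.4f}")
```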
Affiliation(s)
- Andrew E Amini
- Department of Otolaryngology-Head and Neck Surgery, Harvard Medical School, Boston, Massachusetts, USA
- These authors contributed equally to this work
- James G Naples
- Division of Otolaryngology-Head and Neck Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- These authors contributed equally to this work
- Luis Cortina
- Department of Otolaryngology-Head and Neck Surgery, Harvard Medical School, Boston, Massachusetts, USA
- Tiffany Hwa
- Division of Otology, Neurotology, & Lateral Skull Base Surgery, Department of Otolaryngology-Head and Neck Surgery, Hospital of the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Mary Morcos
- Department of Otolaryngology-Head and Neck Surgery, Harvard Medical School, Boston, Massachusetts, USA
- Irina Castellanos
- Department of Otolaryngology-Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio, USA
- Aaron C Moberly
- Department of Otolaryngology-Head and Neck Surgery, Vanderbilt University Medical Center, Nashville, Tennessee, USA
2
Zhang H, Xu L, Ma W, Han J, Wang Y, Ding H, Zhang Y. High variability phonetic training facilitates perception-to-production transfer in Mandarin-speaking children with cochlear implants: An acoustic investigation. J Acoust Soc Am 2024; 156:2299-2314. [PMID: 39382338 DOI: 10.1121/10.0030466]
Abstract
This study primarily aimed to evaluate the effectiveness of high variability phonetic training (HVPT) for children with cochlear implants (CIs) via the cross-modal transfer of perceptual learning to lexical tone production, a scope that has been largely neglected by previous training research. Sixteen CI participants received a five-session HVPT program within a period of three weeks, whereas another 16 CI children were recruited without receiving any formal training. Lexical tone production was assessed with a picture naming task before (pretest), immediately after (posttest), and ten weeks after (follow-up test) the completion of the training protocol. The production samples were coded and analyzed acoustically. Despite considerable distinctions from the typical baselines of normal-hearing peers, the trained CI children exhibited significant improvements in Mandarin tone production from pretest to posttest in pitch height of T1, pitch slope of T2, and pitch curvature of T3. Moreover, the training-induced acoustic changes in the concave characteristic of the T3 contour were retained ten weeks after training termination. This study represents an initial acoustic investigation of HVPT-induced benefits in lexical tone production for the pediatric CI population and provides valuable insights into applying this perceptual training technique as a viable tool in clinical practice.
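The acoustic measures named here (pitch height of T1, pitch slope of T2, pitch curvature of T3) can be illustrated with a low-order polynomial fit to each token's F0 contour, where the linear term approximates slope and the quadratic term curvature. The sketch below is one such parameterization under assumed conventions (semitones re 100 Hz, time normalization, second-order fit); it is not necessarily the authors' exact procedure.

```python
import numpy as np

def tone_contour_features(times_s, f0_hz):
    """Summarize a lexical-tone F0 contour with height, slope, and curvature.

    F0 is converted to semitones (re 100 Hz) and time is normalized to
    [0, 1]; a 2nd-order polynomial fit then gives slope (linear term)
    and curvature (quadratic term, e.g., the concave shape of Tone 3).
    """
    st = 12.0 * np.log2(np.asarray(f0_hz) / 100.0)
    t = (np.asarray(times_s) - times_s[0]) / (times_s[-1] - times_s[0])
    curvature, slope, _ = np.polyfit(t, st, 2)
    return {"height_st": st.mean(), "slope": slope, "curvature": curvature}

# Toy rising contour (Tone 2-like): F0 climbing from 180 to 260 Hz
t = np.linspace(0.0, 0.4, 20)
f0 = np.linspace(180.0, 260.0, 20)
print(tone_contour_features(t, f0))
```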
Affiliation(s)
- Hao Zhang
- School of Foreign Languages and Literature, Shandong University, Jinan, Shandong 250100, China
- Lele Xu
- School of Foreign Languages and Literature, Shandong University, Jinan, Shandong 250100, China
- Wen Ma
- School of Foreign Languages and Literature, Shandong University, Jinan, Shandong 250100, China
- Junning Han
- Hearing and Speech Rehabilitation Center, Zibo Maternal and Child Health Hospital, Zibo, Shandong 255000, China
- Yanxiang Wang
- Hearing and Speech Rehabilitation Center, Zibo Maternal and Child Health Hospital, Zibo, Shandong 255000, China
- Hongwei Ding
- School of Foreign Languages, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yang Zhang
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Masonic Institute for the Developing Brain, University of Minnesota, Minneapolis, Minnesota 55414, USA
3
Hoarau C, Pralus A, Moulin A, Bedoin N, Ginzburg J, Fornoni L, Aguera PE, Tillmann B, Caclin A. Deficits in congenital amusia: Pitch, music, speech, and beyond. Neuropsychologia 2024; 202:108960. [PMID: 39032629 DOI: 10.1016/j.neuropsychologia.2024.108960]
Abstract
Congenital amusia is a neurodevelopmental disorder characterized by deficits of music perception and production, which are related to altered pitch processing. The present study used a wide variety of tasks to test potential patterns of processing impairment in individuals with congenital amusia (N = 18) in comparison to matched controls (N = 19), notably classical pitch processing tests (i.e., pitch change detection, pitch direction of change identification, and pitch short-term memory tasks) together with tasks assessing other aspects of pitch-related auditory cognition, such as emotion recognition in speech, sound segregation in tone sequences, and speech-in-noise perception. Additional behavioral measures were also collected, including text reading/copying tests, visual control tasks, and a subjective assessment of hearing abilities. As expected, amusics' performance was impaired on the three pitch-specific tasks compared with controls. This deficit of pitch perception had a self-perceived impact on amusics' quality of hearing. Moreover, participants with amusia were impaired in emotion recognition in vowels compared with controls, but no group difference was observed for emotion recognition in sentences, replicating previous data. Despite pitch processing deficits, participants with amusia did not differ from controls in sound segregation and speech-in-noise perception. Text reading and visual control tests did not reveal any impairments in participants with amusia compared with controls. However, in the copying test, participants with amusia showed more numerous eye movements and a smaller memory span. These results allow us to refine the pattern of pitch processing and memory deficits in congenital amusia, further contributing to the understanding of pitch-related auditory cognition. Together with previous reports suggesting a comorbidity between congenital amusia and dyslexia, the findings call for further investigation of language-related abilities in this disorder, even in the absence of a neurodevelopmental language disorder diagnosis.
Affiliation(s)
- Caliani Hoarau
- Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France; Humans Matter, Lyon, France
- Agathe Pralus
- Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France; Humans Matter, Lyon, France
- Annie Moulin
- Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France
- Nathalie Bedoin
- Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France; Université Lumière Lyon 2, Lyon, France
- Jérémie Ginzburg
- Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France
- Lesly Fornoni
- Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France
- Pierre-Emmanuel Aguera
- Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France
- Barbara Tillmann
- Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France; Laboratory for Research on Learning and Development, Université de Bourgogne, LEAD-CNRS UMR5022, Dijon, France
- Anne Caclin
- Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France
4
Zhang Y, Johannesen PT, Molaee-Ardekani B, Wijetillake A, Attili Chiea R, Hasan PY, Segovia-Martínez M, Lopez-Poveda EA. Comparison of Performance for Cochlear-Implant Listeners Using Audio Processing Strategies Based on Short-Time Fast Fourier Transform or Spectral Feature Extraction. Ear Hear 2024:00003446-990000000-00339. [PMID: 39288360 DOI: 10.1097/aud.0000000000001565]
Abstract
OBJECTIVES We compared sound quality and performance for a conventional cochlear-implant (CI) audio processing strategy based on the short-time fast Fourier transform (Crystalis) and an experimental strategy based on spectral feature extraction (SFE). In the latter, the more salient spectral features (acoustic events) were extracted and mapped into the CI stimulation electrodes. We hypothesized that (1) SFE would be superior to Crystalis because it can encode acoustic spectral features without the constraints imposed by the short-time fast Fourier transform bin width, and (2) the potential benefit of SFE would be greater for CI users who have less neural cross-channel interaction. DESIGN To examine the first hypothesis, six users of Oticon Medical Digisonic SP CIs were tested in a double-blind design with the SFE and Crystalis strategies on various measures: word recognition in quiet, speech-in-noise reception threshold (SRT), consonant discrimination in quiet, listening effort, melody contour identification (MCI), and subjective sound quality. Word recognition and SRTs were measured on the first and last day of testing (4 to 5 days apart) to assess potential learning and/or acclimatization effects. Other tests were run once between the first and last testing day. Listening effort was assessed by measuring pupil dilation. MCI involved identifying a five-tone contour among five possible contours. Sound quality was assessed subjectively using the multiple stimulus with hidden reference and anchor (MUSHRA) paradigm for sentences, music, and ambient sounds. To examine the second hypothesis, cross-channel interaction was assessed behaviorally using forward masking. RESULTS Word recognition was similar for the two strategies on the first day of testing and improved for both strategies on the last day of testing, with Crystalis improving significantly more. SRTs were worse with SFE than Crystalis on the first day of testing but became comparable on the last day of testing. Consonant discrimination scores were higher for Crystalis than for the SFE strategy. MCI scores and listening effort were not substantially different across strategies. Subjective sound quality scores were lower for the SFE than for the Crystalis strategy. The difference in performance between SFE and Crystalis was greater for CI users with higher channel interaction. CONCLUSIONS CI-user performance was similar with the SFE and Crystalis strategies. Longer acclimatization times may be required to reveal the full potential of the SFE strategy.
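To make the contrast concrete: in a generic short-time FFT front end, spectral resolution is fixed by the analysis frame (bin width = fs / n_fft), which is the constraint SFE is designed to sidestep. The sketch below computes channel energies with such a filterbank; frame size, log-spaced channel edges, and the test signal are illustrative assumptions, not Oticon Medical's Crystalis implementation.

```python
import numpy as np

def stft_band_energies(signal, fs, n_fft=128, hop=64, n_channels=12):
    """Channel energies from a generic short-time FFT filterbank.

    Spectral detail is limited to bins of width fs / n_fft (here 125 Hz
    at fs = 16 kHz), the constraint the abstract attributes to FFT-based
    strategies. Channel edges are log-spaced, a common but assumed choice.
    """
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    edges = np.geomspace(250.0, fs / 2.0, n_channels + 1)
    window = np.hanning(n_fft)
    frames = []
    for start in range(0, len(signal) - n_fft, hop):
        power = np.abs(np.fft.rfft(window * signal[start:start + n_fft])) ** 2
        frames.append([power[(freqs >= lo) & (freqs < hi)].sum()
                       for lo, hi in zip(edges[:-1], edges[1:])])
    return np.array(frames)

fs = 16000
tone = np.sin(2 * np.pi * 1000.0 * np.arange(fs // 10) / fs)  # 1 kHz test tone
energies = stft_band_energies(tone, fs)
print(energies.shape, energies.argmax(axis=1)[:5])  # channel holding the tone
```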
Affiliation(s)
- Yue Zhang
- Department of Research and Technology, Oticon Medical, Vallauris, France
- Peter T Johannesen
- Laboratorio de Audición Computacional y Psicoacústica, Instituto de Neurociencias de Castilla y León, Universidad de Salamanca, Salamanca, Spain
- Grupo de Audiología, Instituto de Investigación Biomédica de Salamanca, Universidad de Salamanca, Salamanca, Spain
- Aswin Wijetillake
- Department of Research and Technology, Oticon Medical, Smørum, Denmark
- Pierre-Yves Hasan
- Department of Research and Technology, Oticon Medical, Smørum, Denmark
- Enrique A Lopez-Poveda
- Laboratorio de Audición Computacional y Psicoacústica, Instituto de Neurociencias de Castilla y León, Universidad de Salamanca, Salamanca, Spain
- Grupo de Audiología, Instituto de Investigación Biomédica de Salamanca, Universidad de Salamanca, Salamanca, Spain
- Departamento de Cirugía, Facultad de Medicina, Universidad de Salamanca, Salamanca, Spain
5
Sheela T, Sasidharan M, Lavanya V. Development of Environmental Sound Perception in Children with Cochlear Implant within 4 Months of Implantation. Indian J Otolaryngol Head Neck Surg 2024; 76:3088-3093. [PMID: 39130335 PMCID: PMC11306886 DOI: 10.1007/s12070-024-04607-w]
Abstract
Aim The aims of the study were (1) to investigate the development of identification of environmental sounds in children with cochlear implants (CIs) within four months from switch-on (i.e., at 0, 2, and 4 months) and (2) to examine the effect of family type on the perception of environmental sounds. Materials and methods A longitudinal design was used with a total of 18 children with CIs in the chronological age range of 3 to 7 years. All participants underwent a closed-set test of Environmental Sound Perception (ESP) to measure longitudinal ESP outcomes at 0 months (within 1 week of switch-on), 2 months, and 4 months of implant age. They were asked to identify the sounds by pointing at the picture representing each sound. Results One-way and two-way ANOVA demonstrated that at 0 months of implant age, the scores were 0%. At 2 months of implant age the scores ranged from 0 to 25%, and at 4 months the scores ranged from 0 to 40%. A statistically significant improvement in ESP was observed at every 2 months of testing from 0 to 4 months of implant age. However, family type revealed no significant differences in performance across implant age. Conclusion The current study reveals that identification of environmental sounds is one of the foremost benefits and early outcomes of CI in children. The perception of environmental sounds develops constantly but gradually with increasing implant age. This information is useful for predicting CI performance during rehabilitation and for setting therapy goals accordingly.
Affiliation(s)
- Sheela T
- Dr. S.R. Chandrasekhar Institute of Speech and Hearing, Bangalore, India
- Megha Sasidharan
- Dr. S.R. Chandrasekhar Institute of Speech and Hearing, Bangalore, India
- V Lavanya
- Sri Ramachandra Institute of Higher Education and Research, Chennai, India
6
Persici V, Castelletti G, Guerzoni L, Cuda D, Majorano M. The role of lexical and prosodic characteristics of mothers' child-directed speech for the early vocabulary development of Italian children with cochlear implants. Int J Lang Commun Disord 2024. [PMID: 38978277 DOI: 10.1111/1460-6984.13087]
Abstract
BACKGROUND Variability in the vocabulary outcomes of children with cochlear implants (CIs) is partially explained by child-directed speech (CDS) characteristics. Yet, relatively little is known about whether and how mothers adapt their lexical and prosodic characteristics to the child's hearing status (before and after implantation, and compared with groups with normal hearing (NH)) and how important they are in affecting vocabulary development in the first 12 months of hearing experience. AIMS To investigate whether mothers of children with CIs produce CDS with similar lexical and prosodic characteristics compared with mothers of age-matched children with NH, and whether they modify these characteristics after implantation. In addition, to investigate whether mothers' CDS characteristics predict children's early vocabulary skills before and after implantation. METHODS & PROCEDURES A total of 34 dyads (17 with NH, 17 with children with CIs; ages = 9-32 months), all acquiring Italian, were involved in the study. Mothers' and children's lexical quantity (tokens) and variety (types), mothers' prosodic characteristics (pitch range and variability), and children's vocabulary skills were assessed at two time points, corresponding to before and 1 year post-CI activation for children with CIs. Children's vocabulary skills were assessed using parent reports; lexical and prosodic characteristics were observed in semi-structured mother-child interactions. OUTCOMES & RESULTS Results showed that mothers of children with CIs produced speech with similar lexical quantity but lower lexical variety, and with increased pitch range and variability, than mothers of children with NH. Mothers generally increased their lexical quantity and variety and their pitch range between sessions. Children with CIs showed reduced expressive vocabulary and lower lexical quantity and variety than their peers 12 months post-CI activation. Mothers' prosodic characteristics did not explain variance in children's vocabulary skills; their lexical characteristics predicted children's early vocabulary and lexical outcomes, especially in the NH group, but were not related to later language development. CONCLUSIONS & IMPLICATIONS Our findings confirm previous studies on other languages and support the idea that the lexical characteristics of mothers' CDS have a positive effect on children's early measures of vocabulary development across hearing groups, whereas prosodic cues play a minor role. Greater input quantity and quality may assist children in the building of basic language model representations, whereas pitch cues may mainly serve attentional and emotional processes. Results emphasize the need for additional longitudinal studies investigating the input received from other figures surrounding the child and its role for children's language development. WHAT THIS PAPER ADDS What is already known on the subject Mothers' CDS is thought to facilitate and support language acquisition in children with various language developmental trajectories, including children with CIs. Because children with CIs are at risk for language delays and have acoustic processing limitations, their mothers may have to produce a lexically simpler but prosodically richer input, compared to mothers of children with NH. 
Yet, the literature reports mixed findings and no study to our knowledge has concurrently addressed the role of mothers' lexical and prosodic characteristics for children's vocabulary development before implantation and in the first 12 months of hearing experience. What this study adds to the existing knowledge The study shows that mothers of children with CIs produce input of similar quantity but reduced variety, and with heightened pitch characteristics, compared to mothers of children with NH. There was also a general increase in mothers' lexical quantity and variety, and in their pitch range, between sessions. Only their lexical characteristics predicted children's early vocabulary skills. Their lexical variety predicted children's expressive vocabulary and lexical variety only in the NH group. What are the practical and clinical implications of this work? These findings expand our knowledge about the effects of maternal input and may contribute to the improvement of early family-centred intervention programmes for supporting language development in children with CIs.
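On the prosodic side, pitch range and pitch variability of the kind examined here are conventionally computed on a semitone scale so that values are comparable across speakers with different baseline F0. A minimal sketch follows; the percentile-based range and the 100 Hz reference are assumed conventions, not necessarily the measures used in the paper.

```python
import numpy as np

def prosodic_summary(f0_hz):
    """Two common child-directed-speech prosody measures: pitch range
    (semitones between the 5th and 95th F0 percentiles, robust to
    octave errors) and pitch variability (SD of F0 in semitones).
    """
    st = 12.0 * np.log2(np.asarray(f0_hz) / 100.0)
    lo, hi = np.percentile(st, [5, 95])
    return {"range_st": hi - lo, "variability_st": st.std()}

# Made-up F0 samples spanning a typical child-directed-speech range
rng = np.random.default_rng(0)
print(prosodic_summary(rng.uniform(150.0, 400.0, 500)))
```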
Affiliation(s)
- Valentina Persici
- Department of Human Sciences, University of Verona, Verona, Italy
- Department of Humanities, University of Urbino 'Carlo Bo', Urbino, Italy
- Letizia Guerzoni
- Otorhinolaryngology Unit, 'Guglielmo da Saliceto' Hospital, Piacenza, Italy
- Domenico Cuda
- Otorhinolaryngology Unit, 'Guglielmo da Saliceto' Hospital, Piacenza, Italy
- University of Parma, Parma, Italy
7
Buss E, Richter ME, Sweeney VN, Davis AG, Dillon MT, Park LR. Effect of Age and Unaided Acoustic Hearing on Pediatric Cochlear Implant Users' Ability to Distinguish Yes/No Statements and Questions. J Speech Lang Hear Res 2024; 67:1932-1944. [PMID: 38748909 DOI: 10.1044/2024_jslhr-23-00631]
Abstract
PURPOSE The purpose of this study was to evaluate the ability to discriminate yes/no questions from statements in three groups of children: bilateral cochlear implant (CI) users, nontraditional CI users with aidable hearing preoperatively in the ear to be implanted, and controls with normal hearing. Half of the nontraditional CI users had sufficient postoperative acoustic hearing in the implanted ear to use electric-acoustic stimulation, and half used a CI alone. METHOD Participants heard recorded sentences that were produced either as yes/no questions or as statements by three male and three female talkers. Three raters scored each participant response as either a question or a statement. Bilateral CI users (n = 40, 4-12 years old) and normal-hearing controls (n = 10, 4-12 years old) were tested binaurally in the free field. Nontraditional CI recipients (n = 22, 6-17 years old) were tested with direct audio input to the study ear. RESULTS For the bilateral CI users, performance was predicted by age but not by 125-Hz acoustic thresholds; just under half (n = 17) of the participants in this group had measurable 125-Hz thresholds in their better ear. For nontraditional CI recipients, better performance was predicted by lower 125-Hz acoustic thresholds in the test ear, and there was no association with participant age. Performance approached that of the normal-hearing controls for some participants in each group. CONCLUSIONS Results suggest that 125-Hz acoustic hearing supports discrimination of yes/no questions and statements in pediatric CI users. Bilateral CI users with little or no acoustic hearing at 125 Hz develop the ability to perform this task, but that ability emerges later than in children with better acoustic hearing. These results underscore the importance of preserving acoustic hearing for pediatric CI users when possible.
Affiliation(s)
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, The University of North Carolina at Chapel Hill
- Margaret E Richter
- Department of Otolaryngology/Head and Neck Surgery, The University of North Carolina at Chapel Hill
- Victoria N Sweeney
- Department of Otolaryngology/Head and Neck Surgery, The University of North Carolina at Chapel Hill
- Center for Hearing Research, Boys Town National Research Hospitals, Omaha, NE
- Amanda G Davis
- Department of Otolaryngology/Head and Neck Surgery, The University of North Carolina at Chapel Hill
- Margaret T Dillon
- Department of Otolaryngology/Head and Neck Surgery, The University of North Carolina at Chapel Hill
- Lisa R Park
- Department of Otolaryngology/Head and Neck Surgery, The University of North Carolina at Chapel Hill
8
Zhang H, Dai X, Ma W, Ding H, Zhang Y. Investigating Perception to Production Transfer in Children With Cochlear Implants: A High Variability Phonetic Training Study. J Speech Lang Hear Res 2024; 67:1206-1228. [PMID: 38466170 DOI: 10.1044/2023_jslhr-23-00573]
Abstract
PURPOSE This study builds upon an established effective training method to investigate the advantages of high variability phonetic identification training for enhancing lexical tone perception and production in Mandarin-speaking pediatric cochlear implant (CI) recipients, who typically face ongoing challenges in these areas. METHOD Thirty-two Mandarin-speaking children with CIs were quasirandomly assigned to the training group (TG) and the control group (CG). The 16 TG participants received five sessions of high variability phonetic training (HVPT) within a period of 3 weeks. The CG participants did not receive the training. Perception and production of Mandarin tones were administered before (pretest) and immediately after (posttest) the completion of HVPT via a lexical tone recognition task and a picture naming task. Both groups participated in the identical pretest and posttest with the same time frame between the two test sessions. RESULTS TG showed significant improvement from pretest to posttest in identifying Mandarin tones for both trained and untrained speech stimuli. Moreover, perceptual learning from HVPT significantly facilitated trainees' production of T1 and T2 as rated by a cohort of 10 Mandarin-speaking adults with normal hearing, which was corroborated by acoustic analyses revealing an improved fundamental frequency (F0) median for T1 and T2 production and enlarged F0 movement for T2 production. In contrast, TG children's production of T3 and T4 showed nonsignificant changes across the two test sessions. Meanwhile, CG did not exhibit significant changes in either perception or production. CONCLUSIONS The results suggest a limited and inconsistent transfer of perceptual learning to lexical tone production in children with CIs, which challenges the notion of a robust transfer and highlights the complexity of the interaction between perceptual training and production outcomes. Further research on individual differences with a longitudinal design is needed to optimize the training protocol or tailor interventions to better meet the diverse needs of learners.
Affiliation(s)
- Hao Zhang
- Center for Clinical Neurolinguistics, School of Foreign Languages and Literature, Shandong University, Jinan, China
- Xuequn Dai
- Center for Clinical Neurolinguistics, School of Foreign Languages and Literature, Shandong University, Jinan, China
- Wen Ma
- Center for Clinical Neurolinguistics, School of Foreign Languages and Literature, Shandong University, Jinan, China
- Hongwei Ding
- Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, Shanghai, China
- Yang Zhang
- Department of Speech-Language-Hearing Sciences and Masonic Institute for the Developing Brain, University of Minnesota, Minneapolis
9
Sathe NC, Kain A, Reiss LAJ. Fusion of dichotic consonants in normal-hearing and hearing-impaired listeners. J Acoust Soc Am 2024; 155:68-77. [PMID: 38174963 PMCID: PMC10990566 DOI: 10.1121/10.0024245]
Abstract
Hearing-impaired (HI) listeners have been shown to exhibit increased fusion of dichotic vowels, even with different fundamental frequency (F0), leading to binaural spectral averaging and interference. To determine whether similar fusion and averaging occur for consonants, four natural and synthesized stop consonants (/pa/, /ba/, /ka/, /ga/) at three F0s of 74, 106, and 185 Hz were presented dichotically, with ΔF0 varied, to normal-hearing (NH) and HI listeners. Listeners identified the one or two consonants perceived, and response options included /ta/ and /da/ as fused percepts. As ΔF0 increased, both groups showed decreases in fusion and increases in percent correct identification of both consonants, with HI listeners displaying similar fusion but poorer identification. Both groups exhibited spectral averaging (psychoacoustic fusion) of place of articulation but phonetic feature fusion for differences in voicing. With synthetic consonants, NH subjects showed increased fusion and decreased identification. Most HI listeners were unable to discriminate the synthetic consonants. The findings suggest smaller differences between groups in consonant fusion than in vowel fusion, possibly due to the presence of more cues for segregation in natural speech or reduced reliance on spectral cues for consonant perception. The inability of HI listeners to discriminate synthetic consonants suggests a reliance on cues other than formant transitions for consonant discrimination.
Affiliation(s)
- Nishad C Sathe
- Oregon Health and Science University, Portland, Oregon 97239, USA
- Alexander Kain
- Oregon Health and Science University, Portland, Oregon 97239, USA
- Lina A J Reiss
- Oregon Health and Science University, Portland, Oregon 97239, USA
10
Zaltz Y. The Impact of Trained Conditions on the Generalization of Learning Gains Following Voice Discrimination Training. Trends Hear 2024; 28:23312165241275895. [PMID: 39212078 PMCID: PMC11367600 DOI: 10.1177/23312165241275895]
Abstract
Auditory training can lead to notable enhancements in specific tasks, but whether these improvements generalize to untrained tasks like speech-in-noise (SIN) recognition remains uncertain. This study examined how training conditions affect generalization. Fifty-five young adults were divided into "Trained-in-Quiet" (n = 15), "Trained-in-Noise" (n = 20), and "Control" (n = 20) groups. Participants completed two sessions. The first session involved an assessment of SIN recognition and voice discrimination (VD) with word or sentence stimuli, employing combined fundamental frequency (F0) and formant frequency voice cues. Subsequently, only the trained groups proceeded to an interleaved training phase, encompassing six VD blocks with sentence stimuli, utilizing either F0-only or formant-only cues. The second session replicated the interleaved training for the trained groups, followed by a second assessment conducted by all three groups, identical to the first session. Results showed significant improvements in the trained task regardless of training conditions. However, VD training with a single cue did not enhance VD with both cues beyond control-group improvements, suggesting limited generalization. Notably, the Trained-in-Noise group exhibited the most significant SIN recognition improvements posttraining, implying generalization across tasks that share similar acoustic conditions. Overall, the findings suggest that training conditions impact generalization by influencing the processing levels associated with the trained task. Training in noisy conditions may prompt higher auditory and/or cognitive processing than training in quiet, potentially extending skills to tasks involving challenging listening conditions, such as SIN recognition. These insights hold significant theoretical and clinical implications, potentially advancing the development of effective auditory training protocols.
Affiliation(s)
- Yael Zaltz
- Department of Communication Disorders, The Stanley Steyer School of Health Professions, Faculty of Medicine, and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
11
Levin M, Zaltz Y. Voice Discrimination in Quiet and in Background Noise by Simulated and Real Cochlear Implant Users. J Speech Lang Hear Res 2023; 66:5169-5186. [PMID: 37992412 DOI: 10.1044/2023_jslhr-23-00019]
Abstract
PURPOSE Cochlear implant (CI) users demonstrate poor voice discrimination (VD) in quiet conditions based on the speaker's fundamental frequency (fo) and formant frequencies (i.e., vocal-tract length [VTL]). Our purpose was to examine the effect of background noise at levels that allow good speech recognition thresholds (SRTs) on VD via acoustic CI simulations and CI hearing. METHOD Forty-eight normal-hearing (NH) listeners who listened via noise-excited (n = 20) or sinewave (n = 28) vocoders and 10 prelingually deaf CI users (i.e., whose hearing loss began before language acquisition) participated in the study. First, the signal-to-noise ratio (SNR) that yields 70.7% correct SRT was assessed using an adaptive sentence-in-noise test. Next, the CI simulation listeners performed 12 adaptive VDs: six in quiet conditions, two with each cue (fo, VTL, fo + VTL), and six amid speech-shaped noise. The CI participants performed six VDs: one with each cue, in quiet and amid noise. SNR at VD testing was 5 dB higher than the individual's SRT in noise (SRTn +5 dB). RESULTS Results showed the following: (a) Better VD was achieved via the noise-excited than the sinewave vocoder, with the noise-excited vocoder better mimicking CI VD; (b) background noise had a limited negative effect on VD, only for the CI simulation listeners; and (c) there was a significant association between SNR at testing and VTL VD only for the CI simulation listeners. CONCLUSIONS For NH listeners who listen to CI simulations, noise that allows good SRT can nevertheless impede VD, probably because VD depends more on bottom-up sensory processing. Conversely, for prelingually deaf CI users, noise that allows good SRT hardly affects VD, suggesting that they rely strongly on bottom-up processing for both VD and speech recognition.
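The 70.7% correct SRT mentioned here is the convergence point of a two-down one-up adaptive staircase (Levitt, 1971): the SNR is lowered after two consecutive correct responses and raised after any error. The sketch below simulates such a track against a logistic listener; step size, reversal count, and the listener model are illustrative assumptions, not the study's parameters.

```python
import math
import random

def two_down_one_up(prob_correct, start_snr=10.0, step=2.0, n_reversals=8):
    """Adaptive SNR track converging on the 70.7%-correct point.

    `prob_correct(snr)` simulates the listener. The SRT is estimated as
    the mean SNR at track reversals (direction changes).
    """
    snr, correct_run, last_dir = start_snr, 0, 0
    reversals = []
    while len(reversals) < n_reversals:
        if random.random() < prob_correct(snr):
            correct_run += 1
            if correct_run < 2:
                continue                      # need two in a row to step down
            correct_run, new_dir = 0, -1      # harder
        else:
            correct_run, new_dir = 0, +1      # easier
        if last_dir and new_dir != last_dir:
            reversals.append(snr)
        snr += new_dir * step
        last_dir = new_dir
    return sum(reversals) / len(reversals)

# Simulated listener whose 70.7%-correct point lies near 0 dB SNR
listener = lambda snr: 1.0 / (1.0 + math.exp(-(snr + 1.5) / 1.5))
random.seed(0)
print(f"estimated SRT: {two_down_one_up(listener):.1f} dB SNR")
```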
Affiliation(s)
- Michal Levin
- Department of Communication Disorders, The Stanley Steyer School of Health Professions, Faculty of Medicine, Tel Aviv University, Israel
- Yael Zaltz
- Department of Communication Disorders, The Stanley Steyer School of Health Professions, Faculty of Medicine, Tel Aviv University, Israel
- Sagol School of Neuroscience, Tel Aviv University, Israel
12
Yüksel M, Sarlik E, Çiprut A. Emotions and Psychological Mechanisms of Listening to Music in Cochlear Implant Recipients. Ear Hear 2023; 44:1451-1463. [PMID: 37280743 DOI: 10.1097/aud.0000000000001388]
Abstract
OBJECTIVES Music is a multidimensional phenomenon and is classified by its arousal properties, emotional quality, and structural characteristics. Although structural features of music (i.e., pitch, timbre, and tempo) and music emotion recognition in cochlear implant (CI) recipients are popular research topics, music-evoked emotions and the related psychological mechanisms that reflect both the individual and social context of music are largely ignored. Understanding the music-evoked emotions (the "what") and related mechanisms (the "why") can help professionals and CI recipients better comprehend the impact of music on CI recipients' daily lives. Therefore, the purpose of this study was to evaluate these aspects in CI recipients and compare the findings to those of normal-hearing (NH) controls. DESIGN This study included 50 CI recipients with diverse auditory experiences: prelingually deafened (deafened at or before 6 years of age) and early implanted (N = 21), prelingually deafened and late implanted (implanted at or after 12 years of age; N = 13), and postlingually deafened (N = 16), as well as 50 age-matched NH controls. All participants completed the same survey, which included 28 emotions and 10 mechanisms (Brainstem reflex, Rhythmic entrainment, Evaluative Conditioning, Contagion, Visual imagery, Episodic memory, Musical expectancy, Aesthetic judgment, Cognitive appraisal, and Lyrics). Data were presented in detail for the CI groups and compared between CI groups and between CI and NH groups. RESULTS Principal component analysis in the CI group showed five emotion factors, explaining 63.4% of the total variance: anxiety and anger, happiness and pride, sadness and pain, sympathy and tenderness, and serenity and satisfaction. Positive emotions such as happiness, tranquility, love, joy, and trust ranked as most often experienced in all groups, whereas negative and complex emotions such as guilt, fear, anger, and anxiety ranked lowest. The CI group ranked lyrics and rhythmic entrainment highest among the emotion mechanisms, and there was a statistically significant group difference in the episodic memory mechanism, in which the prelingually deafened, early implanted group scored the lowest. CONCLUSION Our findings indicate that music can evoke similar emotions in CI recipients with diverse auditory experiences as it does in NH individuals. However, prelingually deafened and early implanted individuals lack autobiographical memories associated with music, which affects the feelings evoked by music. In addition, the preference for rhythmic entrainment and lyrics as mechanisms of music-elicited emotions suggests that rehabilitation programs should pay particular attention to these cues.
Affiliation(s)
- Mustafa Yüksel
- Ankara Medipol University School of Health Sciences, Department of Speech and Language Therapy, Ankara, Turkey
- Esra Sarlik
- Marmara University Institute of Health Sciences, Audiology and Speech Disorders Program, Istanbul, Turkey
- Ayça Çiprut
- Marmara University Faculty of Medicine, Department of Audiology, Istanbul, Turkey
13
Xiang X, Kayser J, Ash S, Zheng C, Sun Y, Weaver A, Dunkle R, Blackburn JA, Halavanau A, Xue J, Himle JA. Web-Based Cognitive Behavioral Therapy for Depression Among Homebound Older Adults: Development and Usability Study. JMIR Aging 2023; 6:e47691. [PMID: 37725423 PMCID: PMC10548322 DOI: 10.2196/47691]
Abstract
BACKGROUND Homebound older adults are a high-risk group for depression. However, many of them face barriers to accessing evidence-supported mental health treatments. Digital mental health interventions can potentially improve treatment access, but few web-based interventions are explicitly tailored for depression in older adults. OBJECTIVE This paper describes the development process of Empower@Home, a web-delivered intervention for depression in homebound older adults that is based on cognitive behavioral therapy, and reports on the outcomes of usability studies. METHODS Empower@Home was developed in collaboration with community agencies, stakeholders, and older adults, guided by user-centered design principles. User needs were assessed through secondary data analysis, demographic and health profiles from administrative data, and interviews and surveys of community partners. A comparative usability evaluation was conducted with 10 older adults to assess the usability of Empower@Home compared to 2 similar programs. Field testing was conducted with 4 end users to detect additional usability issues. RESULTS Feedback and recommendations from community partners heavily influenced the content and design of Empower@Home. The intervention consists of 9 sessions, including psychoeducation and an introduction to cognitive behavioral therapy skills and tools through short video clips, in-session exercises, an animated storyline, and weekly out-of-session home practice. A printed workbook accompanies the web-based lessons. In comparative usability testing (N=10), Empower@Home received a System Usability Scale score of 78 (SD 7.4), which was significantly higher than the 2 comparator programs (t(9)=3.28, P=.005 and t(9)=2.78, P=.011). Most participants (80%, n=8) preferred Empower@Home over the comparators. In the longitudinal field test (n=4), all participants reported liking the program procedures and feeling confident in performing program-related tasks. The single-subject line graph showed an overall downward trend in depression scores over time, offering an encouraging indication of the intervention's potential effects. CONCLUSIONS Collaboration with community stakeholders and careful consideration of potential implementation issues during the design process can result in more usable, engaging, and effective digital mental health interventions.
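For context, a System Usability Scale score such as 78 comes from the standard 10-item scoring rule: odd items contribute (answer - 1), even items (5 - answer), and the sum is scaled by 2.5 to a 0-100 range, with 68 the commonly cited average. A minimal sketch with made-up answers:

```python
def sus_score(responses):
    """Standard System Usability Scale score for one respondent.

    `responses` are the ten 1-5 Likert answers in questionnaire order;
    positively worded odd items and negatively worded even items are
    scored in opposite directions.
    """
    assert len(responses) == 10
    contributions = [(r - 1) if i % 2 == 0 else (5 - r)  # i = 0 is item 1
                     for i, r in enumerate(responses)]
    return 2.5 * sum(contributions)

# Illustrative answers, not study data
print(sus_score([5, 2, 4, 1, 5, 2, 4, 2, 4, 1]))  # -> 85.0
```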
Affiliation(s)
- Xiaoling Xiang
- School of Social Work, University of Michigan-Ann Arbor, Ann Arbor, MI, United States
- Jay Kayser
- School of Social Work, University of Michigan-Ann Arbor, Ann Arbor, MI, United States
- Samson Ash
- School of Social Work, University of Michigan-Ann Arbor, Ann Arbor, MI, United States
- Chuxuan Zheng
- Department of Psychology, University of Michigan-Ann Arbor, Ann Arbor, MI, United States
- Yihang Sun
- School of Social Work, Columbia University, New York City, NY, United States
- Addie Weaver
- School of Social Work, University of Michigan-Ann Arbor, Ann Arbor, MI, United States
- Ruth Dunkle
- School of Social Work, University of Michigan-Ann Arbor, Ann Arbor, MI, United States
- James A Blackburn
- School of Social Work, University of Michigan-Ann Arbor, Ann Arbor, MI, United States
- Alex Halavanau
- SLAC National Accelerator Laboratory, Stanford University, Menlo Park, CA, United States
- Jia Xue
- Factor-Inwentash Faculty of Social Work, University of Toronto, Toronto, ON, Canada
- Faculty of Information, University of Toronto, Toronto, ON, Canada
- Joseph A Himle
- School of Social Work, University of Michigan-Ann Arbor, Ann Arbor, MI, United States
14
Zhang H, Ma W, Ding H, Zhang Y. Sustainable Benefits of High Variability Phonetic Training in Mandarin-speaking Kindergarteners With Cochlear Implants: Evidence From Categorical Perception of Lexical Tones. Ear Hear 2023; 44:990-1006. [PMID: 36806578 DOI: 10.1097/aud.0000000000001341]
Abstract
OBJECTIVES Although pitch reception poses a great challenge for individuals with cochlear implants (CIs), formal auditory training (e.g., high variability phonetic training [HVPT]) has been shown to provide direct benefits in pitch-related perceptual performance such as lexical tone recognition for CI users. As lexical tones in spoken language are expressed with a multitude of distinct spectral, temporal, and intensity cues, it is important to determine the sources of training benefits for CI users. The purpose of the present study was to conduct a rigorous fine-scale evaluation with the categorical perception (CP) paradigm to control the acoustic parameters and test the efficacy and sustainability of HVPT for Mandarin-speaking pediatric CI recipients. The main hypothesis was that HVPT-induced perceptual learning would greatly enhance CI users' ability to extract the primary pitch contours from spoken words for lexical tone identification and discrimination. Furthermore, individual differences in immediate and long-term gains from training would likely be attributable to baseline performance and duration of CI use. DESIGN Twenty-eight prelingually deaf Mandarin-speaking kindergarteners with CIs were tested. Half of them received five sessions of HVPT within a period of 3 weeks. The other half served as controls and did not receive the formal training. Two classical CP tasks on a tonal continuum from Mandarin tone 1 (high-flat in pitch) to tone 2 (mid-rising in pitch) with fixed acoustic features of duration and intensity were administered before (pretest), immediately after (posttest), and 10 weeks after training termination (follow-up test). Participants were instructed either to label a speech stimulus along the continuum (i.e., identification task) or to determine whether a pair of stimuli separated by zero or two steps on the continuum was the same or different (i.e., discrimination task). Identification function measures (i.e., boundary position and boundary width) and discrimination function scores (i.e., between-category score, within-category score, and peakedness score) were assessed for each child participant across the three test sessions. RESULTS Linear mixed-effects (LME) models showed significant training-induced enhancement in lexical tone categorization, with significantly narrower boundary width and better between-category discrimination in the immediate posttest over pretest for the trainees. Furthermore, training-induced gains were reliably retained in the follow-up test 10 weeks after training. By contrast, no significant changes were found in the control group across sessions. Regression analysis confirmed that baseline performance (i.e., boundary width in the pretest session) and duration of CI use were significant predictors of the magnitude of training-induced benefits. CONCLUSIONS The stringent CP tests with synthesized stimuli that excluded acoustic cues other than the pitch contour and were never used in training showed strong evidence for the efficacy of HVPT in yielding immediate and sustained improvement in lexical tone categorization for Mandarin-speaking children with CIs. The training results and individual differences have important implications for developing personalized computer-based short-term HVPT protocols that may have sustainable long-term benefits for aural rehabilitation in this clinical population.
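The identification measures used here (boundary position and boundary width) are typically derived from a logistic fit to each child's identification function along the continuum. The sketch below uses a common formulation in which the boundary is the 50% point and the width is the distance between the 25% and 75% points; that definition of width, and the data, are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """P(tone 2 response) along a tone 1 -> tone 2 stimulus continuum."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

def boundary_measures(steps, p_tone2):
    """Fit a logistic identification function and derive boundary
    position (50% point) and boundary width (25%-to-75% distance,
    which equals 2*ln(3)/k); narrower width = sharper categorization.
    """
    (x0, k), _ = curve_fit(logistic, steps, p_tone2, p0=[np.median(steps), 1.0])
    return {"boundary_position": x0, "boundary_width": abs(2.0 * np.log(3.0) / k)}

steps = np.arange(1, 8)  # 7-step continuum (illustrative)
p = np.array([0.02, 0.05, 0.15, 0.50, 0.85, 0.96, 0.99])
print(boundary_measures(steps, p))
```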
Affiliation(s)
- Hao Zhang
- Center for Clinical Neurolinguistics, School of Foreign Languages and Literature, Shandong University, Jinan, China
- Wen Ma
- Center for Clinical Neurolinguistics, School of Foreign Languages and Literature, Shandong University, Jinan, China
- Hongwei Ding
- Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, Shanghai, China
- Yang Zhang
- Department of Speech-Language-Hearing Sciences and Masonic Institute for the Developing Brain, University of Minnesota, Minneapolis, Minnesota, USA
15
Zaltz Y. The effect of stimulus type and testing method on talker discrimination of school-age children. J Acoust Soc Am 2023; 153:2611. [PMID: 37129674 DOI: 10.1121/10.0017999]
Abstract
Efficient talker discrimination (TD) improves speech understanding under multi-talker conditions. So far, TD of children has been assessed using various testing parameters, making it difficult to draw comparative conclusions. This study explored the effects of the stimulus type and variability on children's TD. Thirty-two children (7-10 years old) underwent eight TD assessments with fundamental frequency + formant changes using an adaptive procedure. Stimuli included consonant-vowel-consonant words or three-word sentences and were either fixed by run or by trial (changing throughout the run). Cognitive skills were also assessed. Thirty-one adults (18-35 years old) served as controls. The results showed (1) poorer TD for the fixed-by-trial than the fixed-by-run method, with both stimulus types for the adults but only with the words for the children; (2) poorer TD for the words than the sentences with the fixed-by-trial method only for the children; and (3) significant correlations between the children's age and TD. These results support a developmental trajectory in the use of perceptual anchoring for TD and in its reliance on comprehensive acoustic and linguistic information. The finding that the testing parameters may influence the top-down and bottom-up processing for TD should be considered when comparing data across studies or when planning new TD experiments.
Affiliation(s)
- Yael Zaltz
- Department of Communication Disorders, The Steyer School of Health Professions, Sackler Faculty of Medicine and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
16
Mishra SK, Fu QJ, Galvin JJ, Galindo A. Suprathreshold auditory processes in listeners with normal audiograms but extended high-frequency hearing loss. J Acoust Soc Am 2023; 153:2745. [PMID: 37133816 DOI: 10.1121/10.0019337]
Abstract
Hearing loss in the extended high-frequency (EHF) range (>8 kHz) is widespread among young normal-hearing adults and could have perceptual consequences such as difficulty understanding speech in noise. However, it is unclear how EHF hearing loss might affect basic psychoacoustic processes. The hypothesis that EHF hearing loss is associated with poorer auditory resolution in the standard frequencies was tested. Temporal resolution was characterized by amplitude modulation detection thresholds (AMDTs), and spectral resolution was characterized by frequency change detection thresholds (FCDTs). AMDTs and FCDTs were measured in adults with or without EHF loss but with normal clinical audiograms. AMDTs were measured with 0.5- and 4-kHz carrier frequencies; similarly, FCDTs were measured for 0.5- and 4-kHz base frequencies. AMDTs were significantly higher with the 4 kHz than the 0.5 kHz carrier, but there was no significant effect of EHF loss. There was no significant effect of EHF loss on FCDTs at 0.5 kHz; however, FCDTs were significantly higher at 4 kHz for listeners with than without EHF loss. This suggests that some aspects of auditory resolution in the standard audiometric frequency range may be compromised in listeners with EHF hearing loss despite having a normal audiogram.
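For readers unfamiliar with the stimuli: an AMDT is the smallest detectable depth of sinusoidal amplitude modulation imposed on a carrier, with depth m conventionally reported as 20*log10(m) dB (more negative = better). A minimal stimulus-generation sketch, with illustrative parameter values rather than the study's:

```python
import numpy as np

def am_tone(carrier_hz, mod_rate_hz, depth_db, dur_s=0.5, fs=44100):
    """Sinusoidally amplitude-modulated tone for AM detection testing.

    depth_db = 20*log10(m), where 0 < m <= 1 is the linear modulation
    depth; at threshold, m is the smallest depth the listener can
    distinguish from an unmodulated tone.
    """
    m = 10.0 ** (depth_db / 20.0)
    t = np.arange(int(dur_s * fs)) / fs
    envelope = 1.0 + m * np.sin(2.0 * np.pi * mod_rate_hz * t)
    return envelope * np.sin(2.0 * np.pi * carrier_hz * t)

# 4-kHz carrier with 8-Hz modulation at -12 dB depth (m ~= 0.25)
x = am_tone(4000.0, 8.0, -12.0)
print(x.shape, round(float(x.max()), 2))
```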
Affiliation(s)
- Srikanta K Mishra
- Department of Speech, Language and Hearing Sciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Qian-Jie Fu
- Department of Head and Neck Surgery, David Geffen School of Medicine, University of California at Los Angeles (UCLA), Los Angeles, California 90095, USA
- John J Galvin
- House Institute Foundation, Los Angeles, California 90075, USA
- Andrea Galindo
- Department of Communication Sciences and Disorders, The University of Texas Rio Grande Valley, Edinburg, Texas 78539, USA
17
Haumann NT, Petersen B, Friis Andersen AS, Faulkner KF, Brattico E, Vuust P. Mismatch negativity as a marker of music perception in individual cochlear implant users: A spike density component analysis study. Clin Neurophysiol 2023; 148:76-92. [PMID: 36822119 DOI: 10.1016/j.clinph.2023.01.015]
Abstract
OBJECTIVE Ninety percent of cochlear implant (CI) users are interested in improving their music perception. However, only a few objective behavioral and neurophysiological tests have been developed for tracing the development of music discrimination skills in CI users. In this study, we aimed to obtain an accurate individual mismatch negativity (MMN) marker that could predict behavioral auditory discrimination thresholds. METHODS We measured the individual MMN response to four magnitudes of deviation in four different musical features (intensity, pitch, timbre, and rhythm) in a rare sample of experienced CI users and a control sample of normally hearing participants. We applied a recently developed spike density component analysis (SCA), which can suppress confounding alpha waves, and contrasted it with previously proposed methods. RESULTS Statistically detected individual MMN predicted attentive sound discrimination ability with high accuracy: 89.2% (278/312 cases) for CI users and 90.5% (384/424 cases) for controls. As expected, MMN was detected for fewer CI users when the sound deviants were of smaller magnitude. CONCLUSIONS The findings support the use of MMN responses in individual CI users as a diagnostic tool for testing music perception. SIGNIFICANCE For CI users, the new SCA method provided more accurate and replicable diagnostic detections than the preceding state-of-the-art methods.
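The MMN itself is conventionally estimated as the deviant-minus-standard difference waveform, with amplitude and latency read from a post-stimulus window; the spike density component analysis used in this study refines that basic contrast to suppress alpha-wave confounds. The sketch below implements only the conventional difference-wave step, on simulated epochs, with an assumed 100-250 ms window.

```python
import numpy as np

def mmn_difference_wave(standard_epochs, deviant_epochs, times, win=(0.10, 0.25)):
    """Conventional MMN estimate: mean deviant ERP minus mean standard ERP.

    Amplitude is taken as the most negative value of the difference wave
    inside `win` (seconds post-stimulus). Returns wave, latency, amplitude.
    """
    diff = deviant_epochs.mean(axis=0) - standard_epochs.mean(axis=0)
    mask = (times >= win[0]) & (times <= win[1])
    i = int(np.argmin(diff[mask]))
    return diff, times[mask][i], diff[mask][i]

# Simulated single-channel epochs: 100 trials x 300 samples at 500 Hz
rng = np.random.default_rng(1)
times = np.arange(300) / 500.0
standard = rng.normal(0.0, 1.0, (100, 300))
deviant = rng.normal(0.0, 1.0, (100, 300)) - 2.0 * np.exp(-((times - 0.17) / 0.03) ** 2)
_, latency, amplitude = mmn_difference_wave(standard, deviant, times)
print(f"MMN at {latency * 1000:.0f} ms, amplitude {amplitude:.2f} (a.u.)")
```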
Affiliation(s)
- Niels Trusbak Haumann
- Center for Music in the Brain, Department of Clinical Medicine, Aarhus University and The Royal Academy of Music, Aarhus/Aalborg, Universitetsbyen 3, 8000 Aarhus C, Denmark
- Bjørn Petersen
- Center for Music in the Brain, Department of Clinical Medicine, Aarhus University and The Royal Academy of Music, Aarhus/Aalborg, Universitetsbyen 3, 8000 Aarhus C, Denmark
- Anne Sofie Friis Andersen
- Center for Music in the Brain, Department of Clinical Medicine, Aarhus University and The Royal Academy of Music, Aarhus/Aalborg, Universitetsbyen 3, 8000 Aarhus C, Denmark
- Elvira Brattico
- Center for Music in the Brain, Department of Clinical Medicine, Aarhus University and The Royal Academy of Music, Aarhus/Aalborg, Universitetsbyen 3, 8000 Aarhus C, Denmark
- Peter Vuust
- Center for Music in the Brain, Department of Clinical Medicine, Aarhus University and The Royal Academy of Music, Aarhus/Aalborg, Universitetsbyen 3, 8000 Aarhus C, Denmark
|
18
|
Musical Mistuning Perception and Appraisal in Cochlear Implant Recipients. Otol Neurotol 2023; 44:e281-e286. [PMID: 36922018 DOI: 10.1097/mao.0000000000003860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/17/2023]
Abstract
OBJECTIVE Music is a crucial art form that can evoke emotions, and the harmonious presence of the human voice in music is an impactful part of this process; accordingly, vocals have had significant effects on contemporary music. Much is known about how cochlear implant (CI) recipients perceive various aspects of music, but how well they perceive vocal tuning within music is not well understood. Hence, this study evaluated the mistuning perception of CI recipients and compared their performance with that of normal-hearing (NH) listeners. STUDY DESIGN, SETTING, AND PATIENTS A total of 16 CI users (7 cisgender men, 9 cisgender women) and 16 sex-matched NH controls, with average ages of 30.2 (±10.9; range, 19-53) years and 23.5 (±6.1; range, 20-37) years, respectively, were enrolled in this study. We evaluated mistuning ability using the mistuning perception test (MPT) and assessed self-perceived music perception and engagement using the music-related quality-of-life questionnaire. Test performance was measured and reported on the item-response theory metric as a z score ranging from -4 to +4. RESULTS A significant difference in MPT scores was found between NH listeners and CI recipients, and a significant correlation was noted between the frequency subscale of the music-related quality-of-life questionnaire and MPT scores. No significant correlations were found between MPT performance and age, age at implantation, or duration of CI use. CONCLUSIONS This study revealed that musical mistuning perception is a limitation for CI recipients, similar to previously evaluated aspects of music perception. It is therefore important to consider this aspect in the assessment of music perception, enjoyment, and music-based auditory interventions in CI recipients, as vocals are paramount in music perception and recreation. The MPT is a convenient and accessible tool for mistuning assessment in CI and hearing-aid users.
|
19
|
Steinmetzger K, Rosen S. No evidence for a benefit from masker harmonicity in the perception of speech in noise. J Acoust Soc Am 2023; 153:1064. [PMID: 36859153 DOI: 10.1121/10.0017065] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/19/2022] [Accepted: 01/10/2023] [Indexed: 06/18/2023]
Abstract
When assessing the intelligibility of speech embedded in background noise, maskers with a harmonic spectral structure have been found to be much less detrimental to performance than noise-based interferers. While spectral "glimpsing" in between the resolved masker harmonics and reduced envelope modulations of harmonic maskers have been shown to contribute, this effect has primarily been attributed to the proposed ability of the auditory system to cancel harmonic maskers from the signal mixture. Here, speech intelligibility in the presence of harmonic and inharmonic maskers with similar spectral glimpsing opportunities and envelope modulation spectra was assessed to test the theory of harmonic cancellation. Speech reception thresholds obtained from normal-hearing listeners revealed no effect of masker harmonicity, either for maskers with static pitch contours or for those with dynamic ones. The results show that harmonicity, or time-domain periodicity, as such does not aid the segregation of speech and masker. Contrary to what might be assumed, this also implies that the saliency of the masker pitch did not affect auditory grouping. Instead, the current data suggest that the reduced masking effectiveness of harmonic sounds is due to the regular spacing of their spectral components.
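The contrast at the heart of this study can be sketched in a few lines: harmonic complex tones have partials at exact multiples of F0, while inharmonic counterparts jitter each partial to destroy periodicity while roughly preserving the spectral spacing. The jitter scheme and all parameters below are illustrative assumptions, not the study's stimuli.

```python
import numpy as np

def complex_tone(f0, n_harmonics=20, jitter=0.0, dur=1.0, fs=44100, rng=None):
    """jitter=0 -> harmonic tone; jitter>0 -> each partial is offset by up
    to +/- jitter*f0, destroying harmonicity but keeping ~f0 spacing."""
    rng = np.random.default_rng(0) if rng is None else rng
    t = np.arange(int(dur * fs)) / fs
    x = np.zeros_like(t)
    for k in range(1, n_harmonics + 1):
        f = k * f0 + (rng.uniform(-jitter, jitter) * f0 if jitter else 0.0)
        x += np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi))
    return x / np.max(np.abs(x))

harmonic = complex_tone(100.0)                 # partials at exact multiples of F0
inharmonic = complex_tone(100.0, jitter=0.3)   # jittered partials, similar spacing
```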
Affiliation(s)
- Kurt Steinmetzger
- Section of Biomagnetism, Department of Neurology, Heidelberg University Hospital, Im Neuenheimer Feld 400, 69120 Heidelberg, Germany
- Stuart Rosen
- Speech, Hearing and Phonetic Sciences, University College London (UCL), Chandler House, 2 Wakefield Street, London, WC1N 1PF, United Kingdom
|
20
|
Gohari N, Dastgerdi ZH, Rouhbakhsh N, Afshar S, Mobini R. Training Programs for Improving Speech Perception in Noise: A Review. J Audiol Otol 2023; 27:1-9. [PMID: 36710414 PMCID: PMC9884994 DOI: 10.7874/jao.2022.00283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2022] [Accepted: 10/26/2022] [Indexed: 01/20/2023] Open
Abstract
Understanding speech in the presence of noise is difficult and challenging, even for people with normal hearing. Accurate pitch perception, the coding and decoding of temporal and intensity cues, and cognitive factors are all involved in speech perception in noise (SPIN); disruption in any of these can be a barrier to SPIN. Because the physiological representations of sounds can be modified through practice, training methods targeting any of these impairments can be used to improve speech perception. This review describes various bottom-up training methods, including pitch training based on the fundamental frequency (F0) and harmonics and spatial, temporal, and phoneme training, as well as top-down methods, such as cognitive training of functional memory. It also discusses music training, which engages both bottom-up and top-down components, and speech-in-noise training. Given the effectiveness of all these training methods, we recommend identifying the deficits underlying an individual's SPIN disorder and selecting the training approach that best targets them.
Affiliation(s)
- Nasrin Gohari
- Hearing Disorders Research Center, Department of Audiology, School of Rehabilitation, Hamadan University of Medical Sciences, Hamadan, Iran
- Zahra Hosseini Dastgerdi
- Department of Audiology, School of Rehabilitation, Isfahan University of Medical Sciences, Isfahan, Iran
- Nematollah Rouhbakhsh
- Department of Audiology, School of Rehabilitation, Tehran University of Medical Sciences, Tehran, Iran
- Sara Afshar
- Hearing Disorders Research Center, Department of Audiology, School of Rehabilitation, Hamadan University of Medical Sciences, Hamadan, Iran
- Razieh Mobini
- Hearing Disorders Research Center, Department of Audiology, School of Rehabilitation, Hamadan University of Medical Sciences, Hamadan, Iran
|
21
|
Shim H, Kim S, Hong J, Na Y, Woo J, Hansen M, Gantz B, Choi I. Differences in neural encoding of speech in noise between cochlear implant users with and without preserved acoustic hearing. Hear Res 2023; 427:108649. [PMID: 36462377 PMCID: PMC9842477 DOI: 10.1016/j.heares.2022.108649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/31/2022] [Revised: 11/06/2022] [Accepted: 11/12/2022] [Indexed: 11/15/2022]
Abstract
Cochlear implants (CIs) have evolved to combine residual acoustic hearing with electric hearing. CI users with residual acoustic hearing have been expected to experience better speech-in-noise perception than CI-only listeners because preserved acoustic cues aid in unmasking speech from background noise. This study sought the neural substrates of better speech unmasking in CI users with preserved acoustic hearing compared to those with a lower degree of acoustic hearing. Cortical evoked responses to speech in multi-talker babble noise were compared between 29 Hybrid (i.e., electric-acoustic stimulation, or EAS) and 29 electric-only CI users. The amplitude ratio of evoked responses to speech and noise, or internal SNR, was significantly larger in the CI users with EAS. This result indicates that CI users with better residual acoustic hearing exhibit enhanced unmasking of speech from background noise.
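A minimal sketch of the "internal SNR" measure described above: the ratio, in dB, of evoked-response amplitude to speech versus to noise. The epoch layout, analysis window, and peak-to-peak amplitude measure are assumptions for illustration, not the study's exact pipeline.

```python
import numpy as np

fs = 250                                    # assumed EEG sampling rate (Hz)
t = np.arange(-0.2, 0.8, 1 / fs)            # assumed epoch time axis (s)

def evoked_amplitude(epochs, window=(0.05, 0.25)):
    """Peak-to-peak amplitude of the trial-averaged response in a window."""
    sel = (t >= window[0]) & (t < window[1])
    erp = epochs.mean(axis=0)[sel]
    return erp.max() - erp.min()

def internal_snr_db(speech_epochs, noise_epochs):
    """Ratio of speech-evoked to noise-evoked response amplitude, in dB."""
    return 20 * np.log10(evoked_amplitude(speech_epochs) /
                         evoked_amplitude(noise_epochs))
```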
Affiliation(s)
- Hwan Shim
- Dept. Electrical and Computer Engineering Technology, Rochester Institute of Technology, Rochester, NY 14623, United States
- Subong Kim
- Dept. Communication Sciences and Disorders, Montclair State University, Montclair, NJ 07043, United States
- Jean Hong
- Dept. Communication Sciences and Disorders, University of Iowa, Iowa City, IA 52242, United States
- Youngmin Na
- Dept. Neurosurgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, United States
- Jihwan Woo
- Dept. Biomedical Engineering, University of Ulsan, Ulsan, Republic of Korea
- Marlan Hansen
- Dept. Otolaryngology - Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, United States
- Bruce Gantz
- Dept. Otolaryngology - Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, United States
- Inyong Choi
- Dept. Communication Sciences and Disorders, University of Iowa, Iowa City, IA 52242, United States
- Dept. Otolaryngology - Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, United States
|
22
|
Burleson AM, Souza PE. Cognitive and linguistic abilities and perceptual restoration of missing speech: Evidence from online assessment. Front Psychol 2022; 13:1059192. [PMID: 36571056 PMCID: PMC9773209 DOI: 10.3389/fpsyg.2022.1059192] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2022] [Accepted: 11/23/2022] [Indexed: 12/13/2022] Open
Abstract
When speech is clear, speech understanding is a relatively simple and automatic process. However, when the acoustic signal is degraded, top-down cognitive and linguistic abilities, such as working memory capacity, lexical knowledge (i.e., vocabulary), inhibitory control, and processing speed, can often support speech understanding. This study examined whether listeners aged 22 to 63 years (mean age 42 years) with better cognitive and linguistic abilities would be better able to perceptually restore missing speech information than those with poorer scores. Additionally, the roles of context and everyday speech were investigated using high-context, low-context, and realistic speech corpora. Sixty-three adult participants with self-reported normal hearing completed a short cognitive and linguistic battery before listening to sentences interrupted by silent gaps or noise bursts. Results indicated that working memory was the most reliable predictor of perceptual restoration ability, followed by lexical knowledge, inhibitory control, and processing speed. Generally, silent-gap conditions were related to and predicted by a broader range of cognitive abilities, whereas noise-burst conditions were related to working memory capacity and inhibitory control. These findings suggest that higher-order cognitive and linguistic abilities facilitate the top-down restoration of missing speech information and contribute to individual variability in perceptual restoration.
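The interruption manipulation is straightforward to sketch: periodically replace portions of the waveform with silence or with level-matched noise bursts. The interruption rate and duty cycle below are illustrative assumptions, not the study's values.

```python
import numpy as np

def interrupt(signal, fs, rate_hz=2.5, duty=0.5, filler="silence", rng=None):
    """Replace the last (1-duty) fraction of each cycle with silence or
    a noise burst matched to the signal's overall RMS level."""
    rng = np.random.default_rng(0) if rng is None else rng
    out = signal.copy()
    period = int(fs / rate_hz)
    gap_len = int(period * (1 - duty))
    rms = np.sqrt(np.mean(signal ** 2))
    for start in range(period - gap_len, len(signal), period):
        stop = min(start + gap_len, len(signal))
        if filler == "silence":
            out[start:stop] = 0.0
        else:
            out[start:stop] = rng.standard_normal(stop - start) * rms
    return out
```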
|
23
|
Lanzilotti C, Andéol G, Micheyl C, Scannella S. Cocktail party training induces increased speech intelligibility and decreased cortical activity in bilateral inferior frontal gyri. A functional near-infrared study. PLoS One 2022; 17:e0277801. [PMID: 36454948 PMCID: PMC9714910 DOI: 10.1371/journal.pone.0277801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2022] [Accepted: 11/03/2022] [Indexed: 12/03/2022] Open
Abstract
The human brain networks responsible for selectively listening to a voice amid other talkers remain to be clarified. The present study aimed to investigate relationships between cortical activity and performance in a speech-in-speech task, before (Experiment I) and after training-induced improvements (Experiment II). In Experiment I, 74 participants performed a speech-in-speech task while their cortical activity was measured using a functional near-infrared spectroscopy (fNIRS) device. One target talker and one masker talker were simultaneously presented at three different target-to-masker ratios (TMRs): adverse, intermediate, and favorable. Behavioral results showed that performance increased monotonically with TMR in some participants, whereas for others it failed to decrease, or even improved, in the adverse-TMR condition. At the neural level, an extensive brain network including frontal (left prefrontal cortex, right dorsolateral prefrontal cortex, and bilateral inferior frontal gyri) and temporal (bilateral auditory cortex) regions was recruited more strongly in the intermediate condition than in the other two. Additionally, bilateral frontal gyri and left auditory cortex activities were positively correlated with behavioral performance in the adverse-TMR condition. In Experiment II, 27 participants whose performance was poorest in the adverse-TMR condition of Experiment I were trained to improve performance in that condition. Results showed significant performance improvements along with decreased activity in bilateral inferior frontal gyri, the right dorsolateral prefrontal cortex, the left inferior parietal cortex, and the right auditory cortex in the adverse-TMR condition after training. Arguably, the lower neural activity reflects more efficient masker inhibition after speech-in-speech training. As speech-in-noise tasks also engage frontal and temporal regions, we suggest that, regardless of the type of masking (speech or noise), the complexity of the task will prompt the involvement of a similar brain network, and that the initially substantial cognitive recruitment will be reduced following training, leading to an economy of cognitive resources.
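Setting the target-to-masker ratio (TMR) amounts to scaling the masker relative to the target's RMS level, as in this minimal sketch (the -6 dB example value is an assumption, not the study's adverse condition):

```python
import numpy as np

def rms(x):
    return np.sqrt(np.mean(x ** 2))

def mix_at_tmr(target, masker, tmr_db):
    """Scale the masker so that 20*log10(rms(target) / rms(scaled masker))
    equals tmr_db, then sum the two signals."""
    gain = rms(target) / (rms(masker) * 10 ** (tmr_db / 20))
    return target + gain * masker

# Example usage: a negative TMR makes the masker louder than the target.
# mixture = mix_at_tmr(target, masker, tmr_db=-6.0)
```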
Affiliation(s)
- Cosima Lanzilotti
- Département Neuroscience et Sciences Cognitives, Institut de Recherche Biomédicale des Armées, Brétigny sur Orge, France
- ISAE-SUPAERO, Université de Toulouse, Toulouse, France
- Thales SIX GTS France, Gennevilliers, France
- Guillaume Andéol
- Département Neuroscience et Sciences Cognitives, Institut de Recherche Biomédicale des Armées, Brétigny sur Orge, France
|
24
|
Anderson SR, Kan A, Litovsky RY. Asymmetric temporal envelope sensitivity: Within- and across-ear envelope comparisons in listeners with bilateral cochlear implants. J Acoust Soc Am 2022; 152:3294. [PMID: 36586876 PMCID: PMC9731674 DOI: 10.1121/10.0016365] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/17/2022] [Revised: 11/14/2022] [Accepted: 11/16/2022] [Indexed: 06/17/2023]
Abstract
For listeners with bilateral cochlear implants (BiCIs), patient-specific differences in the interface between cochlear implant (CI) electrodes and the auditory nerve can lead to degraded temporal envelope information, compromising the ability to distinguish between targets of interest and background noise. It is unclear how comparisons of degraded temporal envelope information across spectral channels (i.e., electrodes) affect the ability to detect differences in the temporal envelope, specifically amplitude modulation (AM) rate. In this study, two pulse trains were presented simultaneously via pairs of electrodes in different places of stimulation, within and/or across ears, with identical or differing AM rates. Results from 11 adults with BiCIs indicated that sensitivity to differences in AM rate was greatest when stimuli were paired between different places of stimulation in the same ear. Sensitivity from pairs of electrodes was predicted by the poorer electrode in the pair or the difference in fidelity between both electrodes in the pair. These findings suggest that electrodes yielding poorer temporal fidelity act as a bottleneck to comparisons of temporal information across frequency and ears, limiting access to the cues used to segregate sounds, which has important implications for device programming and optimizing patient outcomes with CIs.
Affiliation(s)
- Sean R Anderson
- Waisman Center, University of Wisconsin-Madison, Madison, Wisconsin 53705, USA
- Alan Kan
- School of Engineering, Macquarie University, Sydney, New South Wales 2109, Australia
- Ruth Y Litovsky
- Waisman Center, University of Wisconsin-Madison, Madison, Wisconsin 53705, USA
|
25
|
Gohari N, Hosseini Dastgerdi Z, Bernstein LJ, Alain C. Neural correlates of concurrent sound perception: A review and guidelines for future research. Brain Cogn 2022; 163:105914. [PMID: 36155348 DOI: 10.1016/j.bandc.2022.105914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2022] [Revised: 08/30/2022] [Accepted: 09/02/2022] [Indexed: 11/02/2022]
Abstract
The perception of concurrent sound sources depends on processes (i.e., auditory scene analysis) that fuse and segregate acoustic features according to harmonic relations, temporal coherence, and binaural cues (encompassing dichotic pitch, location differences, and simulated echoes). The object-related negativity (ORN) and P400 are electrophysiological indices of concurrent sound perception. Here, we review the different paradigms used to study concurrent sound perception and the brain responses obtained from these paradigms. Recommendations regarding the design and recording parameters of the ORN and P400 are made, and their clinical applications in assessing central auditory processing ability in different populations are discussed.
Affiliation(s)
- Nasrin Gohari
- Department of Audiology, School of Rehabilitation, Hamadan University of Medical Sciences, Hamadan, Iran
- Zahra Hosseini Dastgerdi
- Department of Audiology, School of Rehabilitation, Isfahan University of Medical Sciences, Isfahan, Iran
- Lori J Bernstein
- Department of Supportive Care, University Health Network, and Department of Psychiatry, University of Toronto, Toronto, Canada
- Claude Alain
- Rotman Research Institute, Baycrest Centre for Geriatric Care & Department of Psychology, University of Toronto, Canada
|
26
|
Sauvé SA, Marozeau J, Rich Zendel B. The effects of aging and musicianship on the use of auditory streaming cues. PLoS One 2022; 17:e0274631. [PMID: 36137151 PMCID: PMC9498935 DOI: 10.1371/journal.pone.0274631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/04/2022] [Accepted: 08/31/2022] [Indexed: 11/22/2022] Open
Abstract
Auditory stream segregation, or separating sounds into their respective sources and tracking them over time, is a fundamental auditory ability. Previous research has separately explored the impacts of aging and musicianship on the ability to separate and follow auditory streams. The current study evaluated the simultaneous effects of age and musicianship on auditory streaming induced by three physical features: intensity, spectral envelope, and temporal envelope. In the first study, older and younger musicians and non-musicians with normal hearing identified deviants in a four-note melody interleaved with distractors that were more or less similar to the melody in terms of intensity, spectral envelope, and temporal envelope. In the second study, older and younger musicians and non-musicians participated in a dissimilarity rating paradigm with pairs of melodies that differed along the same three features. Results suggested that auditory streaming skills are maintained in older adults, but that older adults rely on intensity more than younger adults do. Musicianship, in turn, was associated with increased sensitivity to spectral and temporal envelope, acoustic features that are typically less effective for stream segregation, particularly in older adults.
Affiliation(s)
- Sarah A. Sauvé
- Division of Community Health and Humanities, Faculty of Medicine, Memorial University of Newfoundland, St. John’s, Newfoundland and Labrador, Canada
- Jeremy Marozeau
- Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Benjamin Rich Zendel
- Division of Community Health and Humanities, Faculty of Medicine, Memorial University of Newfoundland, St. John’s, Newfoundland and Labrador, Canada
|
27
|
Steinmetzger K, Meinhardt B, Praetorius M, Andermann M, Rupp A. A direct comparison of voice pitch processing in acoustic and electric hearing. Neuroimage Clin 2022; 36:103188. [PMID: 36113196 PMCID: PMC9483634 DOI: 10.1016/j.nicl.2022.103188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2022] [Revised: 08/24/2022] [Accepted: 09/06/2022] [Indexed: 12/14/2022]
Abstract
In single-sided deafness patients fitted with a cochlear implant (CI) in the affected ear and preserved normal hearing in the other ear, acoustic and electric hearing can be directly compared without the need for an external control group. Although poor pitch perception is a crucial limitation when listening through CIs, it remains unclear how exactly the cortical processing of pitch information differs between acoustic and electric hearing. Hence, we separately presented both ears of 20 of these patients with vowel sequences in which the pitch contours were either repetitive or variable, while simultaneously recording functional near-infrared spectroscopy (fNIRS) and EEG data. Overall, the results showed smaller and delayed auditory cortex activity in electric hearing, particularly for the P2 event-related potential component, which appears to reflect the processing of voice pitch information. Both the fNIRS data and EEG source reconstructions furthermore showed that vowel sequences with variable pitch contours evoked additional activity in posterior right auditory cortex in electric but not acoustic hearing. This surprising discrepancy demonstrates, firstly, that the acoustic detail transmitted by CIs is sufficient to distinguish between speech sounds that only vary regarding their pitch information. Secondly, the absence of a condition difference when stimulating the normal-hearing ears suggests a saturation of cortical activity levels following unilateral deafness. Taken together, these results provide strong evidence in favour of using CIs in this patient group.
Affiliation(s)
- Kurt Steinmetzger
- Section of Biomagnetism, Department of Neurology, Heidelberg University Hospital, Im Neuenheimer Feld 400, 69120 Heidelberg, Germany
- Bastian Meinhardt
- Section of Biomagnetism, Department of Neurology, Heidelberg University Hospital, Im Neuenheimer Feld 400, 69120 Heidelberg, Germany
- Mark Praetorius
- Section of Otology and Neurootology, ENT Clinic, Heidelberg University Hospital, Im Neuenheimer Feld 400, 69120 Heidelberg, Germany
- Martin Andermann
- Section of Biomagnetism, Department of Neurology, Heidelberg University Hospital, Im Neuenheimer Feld 400, 69120 Heidelberg, Germany
- André Rupp
- Section of Biomagnetism, Department of Neurology, Heidelberg University Hospital, Im Neuenheimer Feld 400, 69120 Heidelberg, Germany
|
28
|
Fleming JT, Winn MB. Strategic perceptual weighting of acoustic cues for word stress in listeners with cochlear implants, acoustic hearing, or simulated bimodal hearing. J Acoust Soc Am 2022; 152:1300. [PMID: 36182279 PMCID: PMC9439712 DOI: 10.1121/10.0013890] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/27/2022] [Revised: 08/08/2022] [Accepted: 08/16/2022] [Indexed: 05/28/2023]
Abstract
Perception of word stress is an important aspect of recognizing speech, guiding the listener toward candidate words based on the perceived stress pattern. Cochlear implant (CI) signal processing is likely to disrupt some of the available cues for word stress, particularly vowel quality and pitch contour changes. In this study, we used a cue weighting paradigm to investigate differences in stress cue weighting patterns between participants listening with CIs and those with normal hearing (NH). We found that participants with CIs gave less weight to frequency-based pitch and vowel quality cues than NH listeners but compensated by upweighting vowel duration and intensity cues. Nonetheless, CI listeners' stress judgments were also significantly influenced by vowel quality and pitch, and they modulated their usage of these cues depending on the specific word pair in a manner similar to NH participants. In a series of separate online experiments with NH listeners, we simulated aspects of bimodal hearing by combining low-pass filtered speech with a vocoded signal. In these conditions, participants upweighted pitch and vowel quality cues relative to a fully vocoded control condition, suggesting that bimodal listening holds promise for restoring the stress cue weighting patterns exhibited by listeners with NH.
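A hedged sketch of the bimodal simulation described above: low-pass filtered speech for the "acoustic" ear combined with a noise-vocoded signal for the "electric" ear. The cutoff frequency, channel count, and filter orders are illustrative assumptions rather than the study's exact settings, and the input is assumed to be sampled well above 16 kHz.

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def noise_vocode(x, fs, n_channels=8, lo=80.0, hi=8000.0):
    """Replace each band's fine structure with envelope-modulated noise."""
    edges = np.geomspace(lo, hi, n_channels + 1)
    rng = np.random.default_rng(0)
    out = np.zeros_like(x)
    for f1, f2 in zip(edges[:-1], edges[1:]):
        sos = butter(4, [f1, f2], btype="band", fs=fs, output="sos")
        band = sosfilt(sos, x)
        env = np.abs(hilbert(band))                     # band envelope
        carrier = sosfilt(sos, rng.standard_normal(len(x)))
        out += env * carrier
    return out / np.max(np.abs(out))

def simulate_bimodal(x, fs, lp_cutoff=500.0):
    sos = butter(4, lp_cutoff, btype="low", fs=fs, output="sos")
    acoustic_ear = sosfilt(sos, x)                      # residual acoustic hearing
    electric_ear = noise_vocode(x, fs)                  # CI simulation
    return np.stack([acoustic_ear, electric_ear])       # stereo: [left, right]
```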
Affiliation(s)
- Justin T Fleming
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Matthew B Winn
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota 55455, USA
|
29
|
Age-Related Changes in Voice Emotion Recognition by Postlingually Deafened Listeners With Cochlear Implants. Ear Hear 2022; 43:323-334. [PMID: 34406157 PMCID: PMC8847542 DOI: 10.1097/aud.0000000000001095] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
OBJECTIVES Identification of emotional prosody in speech declines with age in normally hearing (NH) adults. Cochlear implant (CI) users have deficits in the perception of prosody, but the effects of age on vocal emotion recognition by adult postlingually deaf CI users are not known. The objective of the present study was to examine age-related changes in CI users' and NH listeners' emotion recognition. DESIGN Participants included 18 CI users (29.6 to 74.5 years) and 43 NH adults (25.8 to 74.8 years). Participants listened to emotion-neutral sentences spoken by a male and a female talker in five emotions (happy, sad, scared, angry, neutral). NH adults heard them in four conditions: unprocessed (full-spectrum) speech and 16-channel, 8-channel, and 4-channel noise-band vocoded speech. The adult CI users listened only to unprocessed (full-spectrum) speech. Sensitivity (d') to emotions and reaction times were obtained using a single-interval, five-alternative, forced-choice paradigm. RESULTS For NH participants, results indicated age-related declines in accuracy and d', and age-related increases in reaction time, in all conditions. For CI users, results indicated an overall deficit as well as age-related declines in overall d', but reaction times were elevated compared with NH listeners and did not show age-related changes. Analyses of accuracy scores (hit rates) were generally consistent with the d' data. CONCLUSIONS Both CI users and NH listeners showed age-related deficits in emotion identification. The CI users' overall deficit in emotion perception, and their slower response times, suggest impaired social communication, which may in turn impact overall well-being, particularly for older CI users, as lower vocal emotion recognition scores have been associated with poorer subjective quality of life in CI patients.
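Sensitivity (d') for a given response category is often derived from hits and false alarms via z-transforms, as in the generic sketch below. This is a common approximation, not necessarily the study's exact computation, and the rate-clipping rule is an assumption.

```python
import numpy as np
from scipy.stats import norm

def dprime_per_category(confusions):
    """confusions[i, j] = number of stimuli of class i labeled as class j."""
    d = []
    for i in range(confusions.shape[0]):
        hits = confusions[i, i] / confusions[i].sum()
        false_alarms = (confusions[:, i].sum() - confusions[i, i]) / (
            confusions.sum() - confusions[i].sum())
        # Clip rates to avoid infinite z-scores at 0 or 1.
        hits, false_alarms = np.clip([hits, false_alarms], 0.01, 0.99)
        d.append(norm.ppf(hits) - norm.ppf(false_alarms))
    return np.array(d)

# Example: pass a 5x5 emotion confusion matrix
# (happy, sad, scared, angry, neutral) to get one d' per emotion.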
|
30
|
Abstract
Hearing in noise is a core problem in audition, and a challenge for hearing-impaired listeners, yet the underlying mechanisms are poorly understood. We explored whether harmonic frequency relations, a signature property of many communication sounds, aid hearing in noise for normal hearing listeners. We measured detection thresholds in noise for tones and speech synthesized to have harmonic or inharmonic spectra. Harmonic signals were consistently easier to detect than otherwise identical inharmonic signals. Harmonicity also improved discrimination of sounds in noise. The largest benefits were observed for two-note up-down "pitch" discrimination and melodic contour discrimination, both of which could be performed equally well with harmonic and inharmonic tones in quiet, but which showed large harmonic advantages in noise. The results show that harmonicity facilitates hearing in noise, plausibly by providing a noise-robust pitch cue that aids detection and discrimination.
|
31
|
Zaltz Y, Kishon-Rabin L. Difficulties Experienced by Older Listeners in Utilizing Voice Cues for Speaker Discrimination. Front Psychol 2022; 13:797422. [PMID: 35310278 PMCID: PMC8928022 DOI: 10.3389/fpsyg.2022.797422] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2021] [Accepted: 01/24/2022] [Indexed: 12/03/2022] Open
Abstract
Human listeners are assumed to apply different strategies to improve speech recognition in background noise. Young listeners with normal hearing (NH), for example, have been shown to follow the voice of a particular speaker based on the fundamental frequency (F0) and formant frequencies, which are both influenced by the gender, age, and size of the speaker. However, the auditory and cognitive processes that underlie the extraction and discrimination of these voice cues across speakers may be subject to age-related decline. The present study aimed to examine the utilization of F0 and formant cues for voice discrimination (VD) in older adults with hearing expected for their age. Difference limens (DLs) for VD were estimated in 15 healthy older adults (65-78 years old) and 35 young adults (18-35 years old) using only F0 cues, only formant frequency cues, and a combination of F0 + formant frequencies. A three-alternative forced-choice paradigm with an adaptive-tracking threshold-seeking procedure was used. The Wechsler backward digit span test was used as a measure of auditory working memory, and the Trail Making Test (TMT) provided cognitive information reflecting the combined effects of processing speed, mental flexibility, and executive control abilities. The results showed that (a) the mean VD thresholds of the older adults were poorer than those of the young adults for all voice cues, although larger variability was observed among the older listeners; (b) both age groups found the formant cues more beneficial for VD than the F0 cues, and the combined (F0 + formant) cues resulted in better thresholds than each cue separately; and (c) significant associations were found for the older adults in the combined F0 + formant condition between VD and TMT scores, and between VD and hearing sensitivity, supporting the notion that an age-related decline in both top-down and bottom-up mechanisms may hamper the ability of older adults to discriminate between voices. The present findings suggest that older listeners may have difficulty following the voice of a specific speaker, and thus in using this as a strategy for listening amid noise. This may help explain their reported difficulty listening in adverse conditions.
Affiliation(s)
- Yael Zaltz
- Department of Communication Disorders, The Stanley Steyer School of Health Professions, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Liat Kishon-Rabin
- Department of Communication Disorders, The Stanley Steyer School of Health Professions, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
|
32
|
Cortical activity evoked by voice pitch changes: a combined fNIRS and EEG study. Hear Res 2022; 420:108483. [DOI: 10.1016/j.heares.2022.108483] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/08/2021] [Revised: 03/02/2022] [Accepted: 03/10/2022] [Indexed: 11/22/2022]
|
33
|
Guest DR, Oxenham AJ. Human discrimination and modeling of high-frequency complex tones shed light on the neural codes for pitch. PLoS Comput Biol 2022; 18:e1009889. [PMID: 35239639 PMCID: PMC8923464 DOI: 10.1371/journal.pcbi.1009889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2021] [Revised: 03/15/2022] [Accepted: 02/02/2022] [Indexed: 11/24/2022] Open
Abstract
Accurate pitch perception of harmonic complex tones is widely believed to rely on temporal fine structure information conveyed by the precise phase-locked responses of auditory-nerve fibers. However, accurate pitch perception remains possible even when spectrally resolved harmonics are presented at frequencies beyond the putative limits of neural phase locking, and it is unclear whether residual temporal information, or a coarser rate-place code, underlies this ability. We addressed this question by measuring human pitch discrimination at low and high frequencies for harmonic complex tones, presented either in isolation or in the presence of concurrent complex-tone maskers. We found that concurrent complex-tone maskers impaired performance at both low and high frequencies, although the impairment introduced by adding maskers at high frequencies relative to low frequencies differed between the tested masker types. We then combined simulated auditory-nerve responses to our stimuli with ideal-observer analysis to quantify the extent to which performance was limited by peripheral factors. We found that the worsening of both frequency discrimination and F0 discrimination at high frequencies could be well accounted for (in relative terms) by optimal decoding of all available information at the level of the auditory nerve. A Python package is provided to reproduce these results, and to simulate responses to acoustic stimuli from the three previously published models of the human auditory nerve used in our analyses.
Affiliation(s)
- Daniel R. Guest
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota, United States of America
- Andrew J. Oxenham
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota, United States of America
|
34
|
Abstract
OBJECTIVES The purpose of the present study was to determine whether age and hearing ability influence selective attention during childhood. Specifically, we hypothesized that immaturity and disrupted auditory experience impede selective attention during childhood. DESIGN Seventy-seven school-age children (5 to 12 years of age) participated in this study: 61 children with normal hearing and 16 children with bilateral hearing loss who use hearing aids and/or cochlear implants. Children performed selective attention-based behavioral change detection tasks comprised of target and distractor streams in the auditory and visual modalities. In the auditory modality, children were presented with two streams of single-syllable words spoken by a male and female talker. In the visual modality, children were presented with two streams of grayscale images. In each task, children were instructed to selectively attend to the target stream, inhibit attention to the distractor stream, and press a key as quickly as possible when they detected a frequency (auditory modality) or color (visual modality) deviant stimulus in the target, but not distractor, stream. Performance on the auditory and visual change detection tasks was quantified by response sensitivity, which reflects children's ability to selectively attend to deviants in the target stream and inhibit attention to those in the distractor stream. Children also completed a standardized measure of attention and inhibitory control. RESULTS Younger children and children with hearing loss demonstrated lower response sensitivity, and therefore poorer selective attention, than older children and children with normal hearing, respectively. The effect of hearing ability on selective attention was observed across the auditory and visual modalities, although the extent of this group difference was greater in the auditory modality than the visual modality due to differences in children's response patterns. Additionally, children's performance on a standardized measure of attention and inhibitory control related to their performance during the auditory and visual change detection tasks. CONCLUSIONS Overall, the findings from the present study suggest that age and hearing ability influence children's ability to selectively attend to a target stream in both the auditory and visual modalities. The observed differences in response patterns across modalities, however, reveal a complex interplay between hearing ability, task modality, and selective attention during childhood. While the effect of age on selective attention is expected to reflect the immaturity of cognitive and linguistic processes, the effect of hearing ability may reflect altered development of selective attention due to disrupted auditory experience early in life and/or a differential allocation of attentional resources to meet task demands.
|
35
|
Lee JH, Shim H, Gantz B, Choi I. Strength of Attentional Modulation on Cortical Auditory Evoked Responses Correlates with Speech-in-Noise Performance in Bimodal Cochlear Implant Users. Trends Hear 2022; 26:23312165221141143. [PMID: 36464791 PMCID: PMC9726851 DOI: 10.1177/23312165221141143] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2022] [Revised: 10/10/2022] [Accepted: 10/17/2022] [Indexed: 12/12/2022] Open
Abstract
Auditory selective attention is a crucial top-down cognitive mechanism for understanding speech in noise. Cochlear implant (CI) users display great variability in speech-in-noise performance that is not easily explained by peripheral auditory profile or demographic factors. Thus, it is imperative to understand whether auditory cognitive processes such as selective attention explain such variability. The present study directly addressed this question by quantifying attentional modulation of cortical auditory responses during an attention task and comparing its individual differences with speech-in-noise performance. In our attention experiment, participants with CIs were given a pre-stimulus visual cue that directed their attention to one of two speech streams and were asked to detect a deviant syllable in the target stream. The two speech streams consisted of a female voice saying "Up" five times at an 800-ms interval and a male voice saying "Down" four times at a 1-s interval. The onset of each syllable elicited distinct event-related potentials (ERPs). At each syllable onset, the difference in ERP amplitude between the two attentional conditions (attended minus ignored) was computed; this amplitude difference served as a proxy for attentional modulation strength. Group-level analysis showed that ERP amplitudes were greater when a syllable was attended than when it was ignored, demonstrating that attention modulated cortical auditory responses. Moreover, the strength of attentional modulation correlated significantly with speech-in-noise performance. These results suggest that the attentional modulation of cortical auditory responses may provide a neural marker for predicting CI users' success in clinical tests of speech-in-noise listening.
Affiliation(s)
- Jae-Hee Lee
- Dept. Communication Sciences and Disorders, University of Iowa, Iowa City, IA, 52242, USA
- Dept. Otolaryngology – Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Hwan Shim
- Dept. Electrical and Computer Engineering Technology, Rochester Institute of Technology, Rochester, NY, 14623, USA
- Bruce Gantz
- Dept. Otolaryngology – Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Inyong Choi
- Dept. Communication Sciences and Disorders, University of Iowa, Iowa City, IA, 52242, USA
- Dept. Otolaryngology – Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
|
36
|
Etard O, Messaoud RB, Gaugain G, Reichenbach T. No Evidence of Attentional Modulation of the Neural Response to the Temporal Fine Structure of Continuous Musical Pieces. J Cogn Neurosci 2021; 34:411-424. [PMID: 35015867 DOI: 10.1162/jocn_a_01811] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Speech and music are spectrotemporally complex acoustic signals that are highly relevant for humans. Both contain a temporal fine structure that is encoded in the neural responses of subcortical and cortical processing centers. The subcortical response to the temporal fine structure of speech has recently been shown to be modulated by selective attention to one of two competing voices. Music similarly often consists of several simultaneous melodic lines, and a listener can selectively attend to a particular one at a time. However, the neural mechanisms that enable such selective attention remain largely enigmatic, not least because most investigations to date have focused on short and simplified musical stimuli. Here, we studied the neural encoding of classical musical pieces in human volunteers, using scalp EEG recordings. We presented volunteers with continuous musical pieces played by one or two instruments. In the latter case, the participants were asked to selectively attend to one of the two competing instruments and to perform a vibrato identification task. We used linear encoding and decoding models to relate the recorded EEG activity to the stimulus waveform. We show that we can measure neural responses to the temporal fine structure of melodic lines played by a single instrument, at the population level as well as for most individual participants. The neural response peaks at a latency of 7.6 msec and is not measurable past 15 msec. When analyzing the neural responses to the temporal fine structure elicited by competing instruments, we found no evidence of attentional modulation. We observed, however, that low-frequency neural activity exhibited a modulation consistent with the behavioral task at latencies from 100 to 160 msec, in a manner similar to the attentional modulation observed for continuous speech (N100). Our results show that, much like speech, the temporal fine structure of music is tracked by neural activity. In contrast to speech, however, this response appears unaffected by selective attention in the context of our experiment.
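Linear encoding models of this kind (temporal response functions) are typically estimated by ridge regression from time-lagged stimulus features to the EEG, as in the sketch below; the lag range and regularization constant are illustrative assumptions.

```python
import numpy as np

def lag_matrix(stimulus, max_lag):
    """Columns hold the stimulus delayed by 0..max_lag samples."""
    n = len(stimulus)
    X = np.zeros((n, max_lag + 1))
    for lag in range(max_lag + 1):
        X[lag:, lag] = stimulus[:n - lag]
    return X

def fit_trf(stimulus, eeg, max_lag, ridge=1e2):
    """Solve (X'X + aI) w = X'y for the temporal response function w."""
    X = lag_matrix(stimulus, max_lag)
    XtX = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ eeg)

# Model fit is typically assessed by correlating predicted with held-out EEG:
# r = np.corrcoef(lag_matrix(stim_test, max_lag) @ w, eeg_test)[0, 1]
```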
|
37
|
Scheuregger O, Hjortkjær J, Dau T. Identification and Discrimination of Sound Textures in Hearing-Impaired and Older Listeners. Trends Hear 2021; 25:23312165211065608. [PMID: 34939472 PMCID: PMC8721370 DOI: 10.1177/23312165211065608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
Sound textures are a broad class of sounds defined by their homogeneous temporal structure. It has been suggested that sound texture perception is mediated by time-averaged summary statistics measured from early stages of the auditory system. The ability of young normal-hearing (NH) listeners to identify synthetic sound textures increases as the statistics of the synthetic texture approach those of its real-world counterpart. In sound texture discrimination, young NH listeners utilize the fine temporal stimulus information for short-duration stimuli, whereas they switch to a time-averaged statistical representation as the stimulus' duration increases. The present study investigated how younger and older listeners with a sensorineural hearing impairment perform in the corresponding texture identification and discrimination tasks in which the stimuli were amplified to compensate for the individual listeners' loss of audibility. In both hearing impaired (HI) listeners and NH controls, sound texture identification performance increased as the number of statistics imposed during the synthesis stage increased, but hearing impairment was accompanied by a significant reduction in overall identification accuracy. Sound texture discrimination performance was measured across listener groups categorized by age and hearing loss. Sound texture discrimination performance was unaffected by hearing loss at all excerpt durations. The older listeners' sound texture and exemplar discrimination performance decreased for signals of short excerpt duration, with older HI listeners performing better than older NH listeners. The results suggest that the time-averaged statistic representations of sound textures provide listeners with cues which are robust to the effects of age and sensorineural hearing loss.
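The time-averaged summary statistics invoked here can be illustrated by computing moments of subband envelopes, loosely following published texture models; the band edges and the particular statistic set below are simplifying assumptions, not the full published statistic set.

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert
from scipy.stats import skew, kurtosis

def envelope_statistics(x, fs, n_bands=8, lo=80.0, hi=8000.0):
    """Per-band, time-averaged envelope statistics of a sound texture."""
    edges = np.geomspace(lo, hi, n_bands + 1)
    stats = []
    for f1, f2 in zip(edges[:-1], edges[1:]):
        sos = butter(4, [f1, f2], btype="band", fs=fs, output="sos")
        env = np.abs(hilbert(sosfilt(sos, x)))
        stats.append([env.mean(), env.std() / env.mean(),  # mean, coeff. of variation
                      skew(env), kurtosis(env)])
    return np.array(stats)   # shape (n_bands, 4)
```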
Affiliation(s)
- Oliver Scheuregger
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
- Jens Hjortkjær
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
- Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital Hvidovre, Kettegård Allé 30, DK-2650 Hvidovre, Denmark
- Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
|
38
|
Reverberation Degrades Pitch Perception but Not Mandarin Tone and Vowel Recognition of Cochlear Implant Users. Ear Hear 2021; 43:1139-1150. [PMID: 34799495 DOI: 10.1097/aud.0000000000001173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
OBJECTIVES The primary goal of this study was to investigate the effects of reverberation on Mandarin tone and vowel recognition of cochlear implant (CI) users and normal-hearing (NH) listeners. To understand the performance of Mandarin tone recognition, this study also measured participants' pitch perception and the availability of temporal envelope cues in reverberation. DESIGN Fifteen CI users and nine NH listeners, all Mandarin speakers, were asked to recognize Mandarin single-vowels produced in four lexical tones and rank harmonic complex tones in pitch with different reverberation times (RTs) from 0 to 1 second. Virtual acoustic techniques were used to simulate rooms with different degrees of reverberation. Vowel duration and correlation between amplitude envelope and fundamental frequency (F0) contour were analyzed for different tones as a function of the RT. RESULTS Vowel durations of different tones significantly increased with longer RTs. Amplitude-F0 correlation remained similar for the falling Tone 4 but greatly decreased for the other tones in reverberation. NH listeners had robust pitch-ranking, tone recognition, and vowel recognition performance as the RT increased. Reverberation significantly degraded CI users' pitch-ranking thresholds but did not significantly affect the overall scores of tone and vowel recognition with CIs. Detailed analyses of tone confusion matrices showed that CI users reduced the flat Tone-1 responses but increased the falling Tone-4 responses in reverberation, possibly due to the falling amplitude envelope of late reflections after the original vowel segment. CI users' tone recognition scores were not correlated with their pitch-ranking thresholds. CONCLUSIONS NH listeners can reliably recognize Mandarin tones in reverberation using salient pitch cues from spectral and temporal fine structures. However, CI users have poorer pitch perception using F0-related amplitude modulations that are reduced in reverberation. Reverberation distorts speech amplitude envelopes, which affect the distribution of tone responses but not the accuracy of tone recognition with CIs. Recognition of vowels with stationary formant trajectories is not affected by reverberation for both NH listeners and CI users, regardless of the available spectral resolution. Future studies should test how the relatively stable vowel and tone recognition may contribute to sentence recognition in reverberation of Mandarin-speaking CI users.
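The amplitude-F0 correlation analysis reduces to correlating a frame-based amplitude envelope with the F0 contour over voiced frames, as sketched below; the frame size and the RMS envelope estimate are assumptions for illustration.

```python
import numpy as np

def frame_rms(x, fs, frame_s=0.01):
    """Amplitude envelope as per-frame RMS."""
    n = int(frame_s * fs)
    n_frames = len(x) // n
    frames = x[:n_frames * n].reshape(n_frames, n)
    return np.sqrt((frames ** 2).mean(axis=1))

def amplitude_f0_correlation(x, f0_contour, fs):
    """f0_contour is assumed sampled at the same frame rate (unvoiced = NaN)."""
    env = frame_rms(x, fs)
    m = min(len(env), len(f0_contour))
    env, f0 = env[:m], np.asarray(f0_contour[:m], dtype=float)
    voiced = ~np.isnan(f0)
    return np.corrcoef(env[voiced], f0[voiced])[0, 1]
```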
|
39
|
Ginzburg J, Moulin A, Fornoni L, Talamini F, Tillmann B, Caclin A. Development of auditory cognition in 5- to 10-year-old children: Focus on musical and verbal short-term memory. Dev Sci 2021; 25:e13188. [PMID: 34751481 DOI: 10.1111/desc.13188] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2020] [Revised: 10/20/2021] [Accepted: 10/22/2021] [Indexed: 11/29/2022]
Abstract
Developmental aspects of auditory cognition were investigated in 5- to 10-year-old children (n = 100). Musical and verbal short-term memory (STM) were assessed by means of delayed matching-to-sample tasks (DMSTs; comparison of two four-item sequences separated by a silent retention delay) with two levels of difficulty. For both musical and verbal materials, children's performance increased from 5 years to about 7 years of age and then remained stable up to 10 years of age, while remaining inferior to that of young adults. Children and adults performed better with verbal material than with musical material. To investigate auditory cognition beyond STM, we assessed speech-in-noise perception with a four-alternative forced-choice task with two conditions of phonological difficulty and two levels of cocktail-party noise intensity. Partial correlations, factoring out the effect of age, showed a significant link between musical STM and speech-in-noise perception in the condition with higher noise intensity. Our findings reveal that auditory STM improves over development with a critical phase around 6-7 years of age, yet these abilities appear to be still immature at 10 years. Musical and verbal STM might in particular share procedural and serial-order processes. Furthermore, musical STM and the ability to perceive relevant speech signals in cocktail-party noise might rely on shared cognitive resources, possibly related to pitch encoding. To the best of our knowledge, this is the first time that auditory STM has been assessed with the same paradigm for musical and verbal material during childhood, providing perspectives regarding diagnosis and remediation in developmental learning disorders.
Affiliation(s)
- Jérémie Ginzburg
- Lyon Neuroscience Research Center, UMR5292, INSERM, U1028, CNRS, Lyon, France
- University Lyon 1, Lyon, France
- Annie Moulin
- Lyon Neuroscience Research Center, UMR5292, INSERM, U1028, CNRS, Lyon, France
- University Lyon 1, Lyon, France
- Lesly Fornoni
- Lyon Neuroscience Research Center, UMR5292, INSERM, U1028, CNRS, Lyon, France
- University Lyon 1, Lyon, France
- Barbara Tillmann
- Lyon Neuroscience Research Center, UMR5292, INSERM, U1028, CNRS, Lyon, France
- University Lyon 1, Lyon, France
- Anne Caclin
- Lyon Neuroscience Research Center, UMR5292, INSERM, U1028, CNRS, Lyon, France
- University Lyon 1, Lyon, France
|
40
|
Flaherty MM, Browning J, Buss E, Leibold LJ. Effects of Hearing Loss on School-Aged Children's Ability to Benefit From F0 Differences Between Target and Masker Speech. Ear Hear 2021; 42:1084-1096. [PMID: 33538428 PMCID: PMC8222052 DOI: 10.1097/aud.0000000000000979] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES The objectives of the study were to (1) evaluate the impact of hearing loss on children's ability to benefit from F0 differences between target and masker speech in the context of aided speech-in-speech recognition and (2) determine whether compromised F0 discrimination associated with hearing loss predicts F0 benefit in individual children. We hypothesized that children wearing appropriately fitted amplification would benefit from F0 differences but would not show the same magnitude of benefit as children with normal hearing. Reduced audibility and poor suprathreshold encoding that degrades frequency discrimination were expected to impair children's ability to segregate talkers based on F0. DESIGN Listeners were 9- to 17-year-olds with bilateral, symmetrical, sensorineural hearing loss ranging in degree from mild to severe. A four-alternative forced-choice procedure was used to estimate thresholds for disyllabic word recognition in a 60-dB-SPL two-talker masker. The same male talker produced the target and masker speech. Target words had either the same mean F0 as the masker or were digitally shifted higher than the masker by three, six, or nine semitones. The F0 benefit was defined as the difference in thresholds between the shifted-F0 conditions and the unshifted-F0 condition. Thresholds for discriminating F0 were also measured, using a three-alternative, three-interval forced-choice procedure, to determine whether compromised sensitivity to F0 differences due to hearing loss would predict children's ability to benefit from F0. Testing was performed in the sound field, and all children wore their personal hearing aids at user settings. RESULTS Children with hearing loss benefited from an F0 difference of nine semitones between target words and masker speech, with older children generally benefitting more than younger children. Some children benefitted from an F0 difference of six semitones, but this was not consistent across listeners. Thresholds for discriminating F0 improved with increasing age and predicted F0 benefit in the nine-semitone condition. An exploratory analysis indicated that F0 benefit was not significantly correlated with the four-frequency pure-tone average (0.5, 1, 2, and 4 kHz), aided audibility, or consistency of daily hearing aid use, although there was a trend for an association with the low-frequency pure-tone average (0.25 and 0.5 kHz). Comparisons of the present data to our previous study of children with normal hearing demonstrated that children with hearing loss benefitted less than children with normal hearing for the F0 differences tested. CONCLUSIONS The results demonstrate that children with mild-to-severe hearing loss who wear hearing aids benefit from relatively large F0 differences between target and masker speech during aided speech-in-speech recognition. The size of the benefit increases with increasing age, consistent with previously reported age effects for children with normal hearing. However, hearing loss reduces children's ability to capitalize on F0 differences between talkers. Audibility alone does not appear to be responsible for this effect; aided audibility and degree of loss were not primary predictors of performance. The ability to benefit from F0 differences may be limited by immature central processing or by aspects of peripheral encoding that are not characterized in standard clinical assessments.
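The F0 manipulation described above is a simple ratio computation in 12-tone equal temperament, sketched here for reference (the example values are illustrative):

```python
def shift_semitones(f0_hz, semitones):
    """One semitone is a factor of 2**(1/12), roughly a 5.95% change."""
    return f0_hz * 2 ** (semitones / 12)

# Example: a 100 Hz F0 shifted up by 9 semitones, the largest
# difference tested, gives about 168.2 Hz.
print(shift_semitones(100.0, 9))
```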
Affiliation(s)
- Mary M. Flaherty
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign, Illinois, USA
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, School of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA
- Lori J. Leibold
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska, USA
41
Kim S, Chou HH, Luo X. Mandarin tone recognition training with cochlear implant simulation: Amplitude envelope enhancement and cue weighting. J Acoust Soc Am 2021; 150:1218. [PMID: 34470277 DOI: 10.1121/10.0005878] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/08/2021] [Accepted: 07/22/2021] [Indexed: 06/13/2023]
Abstract
With limited fundamental frequency (F0) cues, cochlear implant (CI) users recognize Mandarin tones using the amplitude envelope. This study investigated whether tone recognition training with amplitude envelope enhancement improves tone recognition and cue weighting with CIs. Three groups of CI-simulation listeners received, respectively, training using vowels with the amplitude envelope modified to resemble the F0 contour (enhanced-amplitude-envelope training), training using natural vowels (natural-amplitude-envelope training), or exposure to natural vowels without training. Tone recognition with natural and enhanced amplitude envelope cues and the cue weighting of amplitude envelope and F0 contour were measured in pre-, post-, and retention-tests. With similar pre-test performance, both training groups had better tone recognition than the no-training group after training. Only enhanced-amplitude-envelope training increased the benefits of amplitude envelope enhancement in the post- and retention-tests relative to the pre-test. Neither training paradigm increased the cue weighting of amplitude envelope and F0 contour more than stimulus exposure alone. Listeners attending more to the amplitude envelope in the pre-test tended to have better tone recognition with enhanced amplitude envelope cues before training and to improve more in tone recognition after enhanced-amplitude-envelope training. The results suggest that auditory training and speech enhancement may bring maximum benefits to CI users when combined.
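A minimal sketch of generic envelope reshaping, assuming Hilbert-envelope extraction and per-sample gain correction; the study's actual enhancement algorithm is not specified in the abstract, and all signal parameters here are placeholders:

```python
# Sketch: scale a signal so its amplitude envelope follows a target
# contour (here, a normalized F0 contour). Illustrative only; this is
# not the authors' specific enhancement method.
import numpy as np
from scipy.signal import hilbert

def reshape_envelope(x: np.ndarray, target_env: np.ndarray,
                     eps: float = 1e-6) -> np.ndarray:
    env = np.abs(hilbert(x))         # current amplitude envelope
    gain = target_env / (env + eps)  # per-sample correction gain
    return x * gain

# Hypothetical usage: make a vowel's envelope track a rising contour.
fs = 16000
t = np.arange(fs) / fs
vowel = np.sin(2 * np.pi * 150 * t)      # stand-in for a vowel waveform
f0_contour = np.linspace(0.2, 1.0, fs)   # assumed normalized F0 contour
enhanced = reshape_envelope(vowel, f0_contour)
```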
Affiliation(s)
- Seeon Kim
- Program of Speech and Hearing Science, College of Health Solutions, Arizona State University, Tempe, Arizona 85287, USA
- Hsiao-Hsiuan Chou
- Program of Speech and Hearing Science, College of Health Solutions, Arizona State University, Tempe, Arizona 85287, USA
- Xin Luo
- Program of Speech and Hearing Science, College of Health Solutions, Arizona State University, Tempe, Arizona 85287, USA
42
Zaltz Y, Goldsworthy RL, Eisenberg LS, Kishon-Rabin L. Children With Normal Hearing Are Efficient Users of Fundamental Frequency and Vocal Tract Length Cues for Voice Discrimination. Ear Hear 2021; 41:182-193. [PMID: 31107364 PMCID: PMC9371943 DOI: 10.1097/aud.0000000000000743] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
BACKGROUND The ability to discriminate between talkers assists listeners in understanding speech in a multitalker environment. This ability has been shown to be influenced by sensory processing of vocal acoustic cues, such as fundamental frequency (F0) and the formant frequencies that reflect the talker's vocal tract length (VTL), and by cognitive processes, such as attention and memory. It is, therefore, suggested that children who exhibit immature sensory and/or cognitive processing will demonstrate poor voice discrimination (VD) compared with young adults. Moreover, greater difficulties in VD may be associated with spectral degradation, as experienced by children with cochlear implants. OBJECTIVES The aims of this study were as follows: (1) to assess the use of F0 cues, VTL cues, and the combination of both cues for VD in normal-hearing (NH) school-age children and to compare their performance with that of NH adults; (2) to assess the influence of spectral degradation, by means of vocoded speech, on the use of F0 and VTL cues for VD in NH children; and (3) to assess the contribution of attention, working memory, and nonverbal reasoning to performance. DESIGN Forty-one children, 8 to 11 years of age, were tested with nonvocoded stimuli. Twenty-one of them were also tested with eight-channel, noise-vocoded stimuli. Twenty-one young adults (18 to 35 years) were tested for comparison. A three-interval, three-alternative forced-choice paradigm with an adaptive tracking procedure was used to estimate the difference limens (DLs) for VD when F0, VTL, and F0 + VTL were manipulated separately. Auditory memory, visual attention, and nonverbal reasoning were assessed for all participants. RESULTS (a) Children's F0 and VTL discrimination abilities were comparable to those of adults, suggesting that most school-age children utilize both cues effectively for VD. (b) Children's VD was associated with trail making test scores that assessed visual attention abilities and speed of processing, possibly reflecting their need to recruit cognitive resources for the task. (c) The best DLs were achieved for the combined (F0 + VTL) manipulation for both children and adults, suggesting that children at this age are already capable of integrating spectral and temporal cues. (d) Both children and adults found the VTL manipulations more beneficial for VD than the F0 manipulations, suggesting that formant frequencies are more reliable than F0 for identifying a specific speaker. (e) Poorer DLs were achieved with the vocoded stimuli, though the children maintained thresholds and a pattern of performance across manipulations similar to those of the adults. CONCLUSIONS The present study is the first to assess the contribution of F0, VTL, and the combined F0 + VTL to the discrimination of speakers in school-age children. The findings support the notion that many NH school-age children have effective spectral and temporal coding mechanisms that allow sufficient VD, even in the presence of spectrally degraded information. These results may challenge the notion that immature sensory processing underlies poor listening abilities in children, further implying that other processing mechanisms contribute to their difficulty understanding speech in a multitalker environment. These outcomes may also provide insight into the VD processes of children under listening conditions that are similar to those of cochlear implant users.
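A minimal sketch of a 2-down/1-up adaptive track of the kind commonly used to estimate difference limens in forced-choice tasks; the step sizes, starting value, trial cap, and stopping rule here are illustrative assumptions, not the study's tracking parameters:

```python
# Sketch: 2-down/1-up staircase converging near 70.7% correct.
# respond(delta) is a callable returning True for a correct trial.
def staircase(respond, start=12.0, step=2.0, min_step=0.5,
              n_reversals=8, max_trials=100):
    delta, run, revs, last_dir = start, 0, [], None
    for _ in range(max_trials):
        if len(revs) >= n_reversals:
            break
        if respond(delta):
            run += 1
            if run < 2:
                continue            # need 2 correct before making it harder
            run, direction = 0, -1  # 2 correct -> decrease the difference
        else:
            run, direction = 0, +1  # 1 wrong -> increase the difference
        if last_dir is not None and direction != last_dir:
            revs.append(delta)                  # record reversal point
            step = max(step / 2.0, min_step)    # shrink step at reversals
        last_dir = direction
        delta = max(delta + direction * step, min_step)
    tail = revs[-6:]                            # mean of final reversals
    return sum(tail) / max(len(tail), 1)
```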
Affiliation(s)
- Yael Zaltz
- Department of Communication Disorders, The Stanley Steyer School of Health Professions, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- University of Southern California Tina and Rick Caruso Department of Otolaryngology—Head & Neck Surgery, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA
- Raymond L. Goldsworthy
- University of Southern California Tina and Rick Caruso Department of Otolaryngology—Head & Neck Surgery, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA
- Laurie S. Eisenberg
- University of Southern California Tina and Rick Caruso Department of Otolaryngology—Head & Neck Surgery, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA
- Liat Kishon-Rabin
- Department of Communication Disorders, The Stanley Steyer School of Health Professions, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
43
Abstract
OBJECTIVES Individuals with cochlear implants (CIs) show reduced word and auditory emotion recognition abilities relative to their peers with normal hearing. Modern CI processing strategies are designed to preserve the acoustic cues requisite for word recognition rather than the cues required for accessing other signal information (e.g., talker gender or emotional state). While word recognition is undoubtedly important for communication, the inaccessibility of this additional signal information in speech may lead to negative social experiences and outcomes for individuals with hearing loss. This study aimed to evaluate whether the emphasis on word recognition preservation in CI processing has unintended consequences for the perception of other talker information, such as emotional state. DESIGN Twenty-four young adult listeners with normal hearing listened to sentences and either reported a target word in each sentence (word recognition task) or selected the emotion of the talker (emotion recognition task) from a list of options (Angry, Calm, Happy, and Sad). Sentences were blocked by task type (emotion recognition versus word recognition) and processing condition (unprocessed versus 8-channel noise vocoder) and presented randomly within each block at three signal-to-noise ratios (SNRs) in a background of speech-shaped noise. Confusion matrices showed the number of errors in emotion recognition by listeners. RESULTS Listeners demonstrated better emotion recognition performance than word recognition performance at the same SNR. Unprocessed speech resulted in higher recognition rates than vocoded stimuli. Recognition performance (for both words and emotions) decreased with worsening SNR. Vocoding had a greater negative impact on emotion recognition than on word recognition. CONCLUSIONS These data confirm prior work suggesting that, in background noise, emotional prosodic information in speech is easier to recognize than word information, even after simulated CI processing. However, emotion recognition may be more negatively impacted by background noise and CI processing than word recognition. Future work could explore CI processing strategies that better encode prosodic information and investigate this effect in individuals with CIs as opposed to vocoded simulation. This study emphasizes the need for clinicians to consider not only word recognition but also other aspects of speech that are critical to successful social communication.
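A minimal sketch of an 8-channel noise vocoder of the kind described above, assuming log-spaced band edges, Butterworth analysis filters, and Hilbert envelopes; the filter parameters and band limits are assumptions, not the study's settings:

```python
# Sketch: channel noise vocoder. Band-split the input, extract each
# band's envelope, and use it to modulate band-limited noise.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_ch=8, lo=100.0, hi=7000.0):
    edges = np.geomspace(lo, hi, n_ch + 1)      # assumed log-spaced edges
    out = np.zeros(len(x), dtype=float)
    noise = np.random.randn(len(x))
    for f1, f2 in zip(edges[:-1], edges[1:]):
        sos = butter(4, [f1, f2], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        env = np.abs(hilbert(band))             # channel envelope
        carrier = sosfiltfilt(sos, noise)       # band-limited noise carrier
        out += env * carrier
    return out / (np.max(np.abs(out)) + 1e-9)   # rough peak normalization
```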
44
Drew JA, Brimijoin WO. Negative impacts from latency masked by noise in simulated beamforming. PLoS One 2021; 16:e0254119. [PMID: 34197551 PMCID: PMC8248715 DOI: 10.1371/journal.pone.0254119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2020] [Accepted: 06/20/2021] [Indexed: 11/18/2022] Open
Abstract
Those experiencing hearing loss face severe challenges in perceiving speech in noisy situations, such as a busy restaurant or cafe. Many factors contribute to this deficit, including decreased audibility, reduced frequency resolution, and a decline in temporal synchrony across the auditory system. Some hearing assistive devices implement beamforming, in which multiple microphones are used in combination to attenuate surrounding noise while the target speaker is left unattenuated. In increasingly challenging auditory environments, more complex beamforming algorithms are required, which increases the processing time needed to provide a useful signal-to-noise ratio (SNR) for the target speech. This study investigated whether the benefits of signal enhancement from beamforming are outweighed by the negative perceptual impacts of an increase in latency between the direct acoustic signal and the digitally enhanced signal. The hypothesis was that an increase in latency between the two identical speech signals would decrease intelligibility of the speech signal. Using three gain/latency pairs from a beamforming simulation previously completed in the lab, perceptual SNR thresholds for a simulated use case were obtained from normal-hearing participants. No significant differences were detected between the three conditions. When two copies of the same speech signal are presented at varying gain/latency pairs in a noisy environment, any negative effects of latency on intelligibility are masked by the noise. These results allow for more lenient restrictions on processing delays in hearing assistive devices.
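A minimal sketch of the simulated listening condition, assuming the percept is modeled as the direct signal plus a delayed, gain-adjusted copy; the gain and latency values below are placeholders, not the study's tested pairs:

```python
# Sketch: direct speech mixed with a delayed, amplified copy, as when a
# hearing device adds processing latency to an enhanced signal path.
import numpy as np

def direct_plus_processed(speech, fs, gain_db=6.0, latency_ms=10.0):
    delay = int(round(fs * latency_ms / 1000.0))  # latency in samples
    gain = 10.0 ** (gain_db / 20.0)               # dB -> linear gain
    delayed = np.concatenate([np.zeros(delay), speech]) * gain
    direct = np.concatenate([speech, np.zeros(delay)])
    return direct + delayed                       # two copies, offset in time
```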
Affiliation(s)
- Jordan A. Drew
- Facebook AR/VR - Audio, Redmond, Washington, United States of America
- Department of Electrical and Computer Engineering, University of Washington, Seattle, Washington, United States of America
- W. Owen Brimijoin
- Facebook AR/VR - Audio, Redmond, Washington, United States of America
45
Easwar V, Boothalingam S, Flaherty R. Fundamental frequency-dependent changes in vowel-evoked envelope following responses. Hear Res 2021; 408:108297. [PMID: 34229221 DOI: 10.1016/j.heares.2021.108297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/23/2021] [Revised: 06/03/2021] [Accepted: 06/09/2021] [Indexed: 10/21/2022]
Abstract
Scalp-recorded envelope following responses (EFRs) provide a non-invasive method to assess the encoding of the fundamental frequency (f0) of voice, which is important for speech understanding. It is well known that EFRs are influenced by voice f0. However, this effect of f0 has not been examined independent of concomitant changes in spectra or neural generators. We evaluated the effect of voice f0 on EFRs while controlling for vowel formant characteristics and, by using a small f0 range, potentially avoiding significant changes in the dominant neural generators. EFRs were elicited by a male-spoken vowel /u/ (average f0 = 100.4 Hz) and its lowered-f0 version (average f0 = 91.9 Hz) with closely matched formant characteristics. Vowels were presented to each ear of 17 young adults with normal hearing. EFRs were simultaneously recorded between the vertex and the nape, and between the vertex and the ipsilateral mastoid, the two most common electrode montages used for EFRs. Our results indicate that when vowel formant characteristics are matched, an increase in f0 by 8.5 Hz reduces EFR amplitude by 25 nV, phase coherence by 0.05, and signal-to-noise ratio by 3.5 dB, on average. The reduction in EFR characteristics was similar across ears of stimulation and across the two montages. These findings will help parse the influence of f0 versus stimulus spectra on EFRs when both co-vary.
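A minimal sketch of one common spectral measure of EFR strength: read the FFT amplitude at the f0 bin and compare it with neighboring noise bins. The number of noise bins is an assumption, and the authors' exact amplitude, phase-coherence, and SNR computations may differ:

```python
# Sketch: response SNR (dB) at f0 from an averaged EFR waveform.
import numpy as np

def efr_snr_db(response, fs, f0, n_noise_bins=10):
    spec = np.abs(np.fft.rfft(response)) / len(response)
    freqs = np.fft.rfftfreq(len(response), 1.0 / fs)
    k = int(np.argmin(np.abs(freqs - f0)))    # bin nearest f0
    signal = spec[k]
    # Mean amplitude of bins flanking f0 (assumes k > n_noise_bins).
    noise = np.mean(np.r_[spec[k - n_noise_bins:k],
                          spec[k + 1:k + 1 + n_noise_bins]])
    return 20.0 * np.log10(signal / noise)
```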
Affiliation(s)
- Vijayalakshmi Easwar
- Department of Communication Sciences and Disorders, University of Wisconsin-Madison, United States; Waisman Center, University of Wisconsin-Madison, United States
- Sriram Boothalingam
- Department of Communication Sciences and Disorders, University of Wisconsin-Madison, United States; Waisman Center, University of Wisconsin-Madison, United States
- Regan Flaherty
- Department of Communication Sciences and Disorders, University of Wisconsin-Madison, United States; Waisman Center, University of Wisconsin-Madison, United States
46
Johnson KC, Xie Z, Shader MJ, Mayo PG, Goupell MJ. Effect of Chronological Age on Pulse Rate Discrimination in Adult Cochlear-Implant Users. Trends Hear 2021; 25:23312165211007367. [PMID: 34028313 PMCID: PMC8150454 DOI: 10.1177/23312165211007367] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
Cochlear-implant (CI) users rely heavily on temporal envelope cues to understand speech. Temporal processing abilities may decline with advancing age in adult CI users. This study investigated the effect of age on the ability to discriminate changes in pulse rate. Twenty CI users aged 23 to 80 years participated in a rate discrimination task. They attempted to discriminate a 35% rate increase from baseline rates of 100, 200, 300, 400, or 500 pulses per second. The stimuli were electrical pulse trains delivered to a single electrode via direct stimulation to an apical (Electrode 20), a middle (Electrode 12), or a basal location (Electrode 4). Electrically evoked compound action potential amplitude growth functions were recorded at each of those electrodes as an estimate of peripheral neural survival. Results showed that temporal pulse rate discrimination performance declined with advancing age at higher stimulation rates (e.g., 500 pulses per second) when compared with lower rates. The age-related changes in temporal pulse rate discrimination at higher stimulation rates persisted after statistical analysis to account for the estimated peripheral contributions from electrically evoked compound action potential amplitude growth functions. These results indicate the potential contributions of central factors to the limitations in temporal pulse rate discrimination ability associated with aging in CI users.
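The rate conditions above reduce to simple arithmetic (a 35% increase on each baseline rate). A minimal sketch, with an assumed train duration as a placeholder:

```python
# Sketch: target rates for the discrimination task and onset times for a
# uniform single-electrode pulse train. Duration is an assumption.
import numpy as np

baselines = [100, 200, 300, 400, 500]           # pulses per second
targets = [round(r * 1.35) for r in baselines]  # 135, 270, 405, 540, 675

def pulse_times(rate_pps: float, dur_s: float = 0.4) -> np.ndarray:
    """Onset times (s) of a uniform pulse train at the given rate."""
    return np.arange(0.0, dur_s, 1.0 / rate_pps)
```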
Affiliation(s)
- Kelly C Johnson
- Department of Hearing and Speech Sciences, University of Maryland, College Park, United States
- Zilong Xie
- Department of Hearing and Speech, University of Kansas Medical Center, Kansas City, United States
- Maureen J Shader
- Department of Hearing and Speech Sciences, University of Maryland, College Park, United States; Bionics Institute, Melbourne, Australia; Department of Medical Bionics, The University of Melbourne, Melbourne, Australia
- Paul G Mayo
- Department of Hearing and Speech Sciences, University of Maryland, College Park, United States
- Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, United States
47
Rapid Assessment of Non-Verbal Auditory Perception in Normal-Hearing Participants and Cochlear Implant Users. J Clin Med 2021; 10:jcm10102093. [PMID: 34068067 PMCID: PMC8152499 DOI: 10.3390/jcm10102093] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2021] [Revised: 04/26/2021] [Accepted: 05/06/2021] [Indexed: 01/17/2023] Open
Abstract
In cases of hearing loss, cochlear implants (CIs) allow for the restoration of hearing. Despite the advantages of CIs for speech perception, CI users still complain of poor perception of their auditory environment. To assess non-verbal auditory perception in CI users, we developed five listening tests. These tests measure pitch change detection, pitch direction identification, pitch short-term memory, auditory stream segregation, and emotional prosody recognition, along with perceived intensity ratings. To test the potential benefit of visual cues for pitch processing, the three pitch tests presented visual indications for performing the task on half of the trials. We tested 10 normal-hearing (NH) participants, with the material presented as original and vocoded sounds, and 10 post-lingually deaf CI users. With the vocoded sounds, the NH participants had reduced scores for the detection of small pitch differences and reduced emotion recognition and streaming abilities compared with the original sounds. Similarly, the CI users had deficits in detecting small pitch differences and recognizing emotions, as well as a decreased streaming capacity. Overall, this assessment allows for the rapid detection of specific patterns of non-verbal auditory perception deficits. The current findings also open new perspectives on how to enhance pitch perception capacities using visual cues.
48
Elmahallawi TH, Gabr TA, Darwish ME, Seleem FM. Children with developmental language disorder: a frequency following response in the noise study. Braz J Otorhinolaryngol 2021; 88:954-961. [PMID: 33766501 PMCID: PMC9615520 DOI: 10.1016/j.bjorl.2021.01.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2020] [Revised: 12/21/2020] [Accepted: 01/31/2021] [Indexed: 11/27/2022] Open
Abstract
Introduction Children with developmental language disorder have been reported to have poor temporal auditory processing. This study examined the frequency following response in this population. Objective This work aimed to investigate speech processing in quiet and in noise. Methods Two groups of children were included: a control group (15 children with normal language development) and a study group (25 children diagnosed with developmental language disorder). All children were submitted to an intelligence scale, language assessment, full audiological evaluation, and frequency following response testing in quiet and in noise (+5 and +10 QNR). Results There was no statistically significant difference between the two groups in IQ or pure-tone average. In the study group, advanced analysis of the frequency following response showed reduced F0 and F2 amplitudes. Noise also affected both the transient and sustained components of the frequency following response in this group. Conclusion Children with developmental language disorder have difficulty with speech processing, especially in the presence of background noise. The frequency following response is an efficient procedure that can be used to assess speech processing problems in children with developmental language disorder.
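A minimal sketch of presenting a stimulus in noise at a fixed speech-to-noise ratio in dB, as in the two noise conditions above; RMS-based scaling is an assumed convention, not taken from the paper:

```python
# Sketch: mix noise into a stimulus at a target speech-to-noise ratio.
import numpy as np

def mix_at_ratio(speech, noise, ratio_db):
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    noise = noise[: len(speech)]                       # match lengths
    scale = (rms(speech) / rms(noise)) / (10 ** (ratio_db / 20))
    return speech + noise * scale

# e.g., mix_at_ratio(stim, babble, 5.0) and mix_at_ratio(stim, babble, 10.0)
```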
Affiliation(s)
- Trandil H Elmahallawi
- Tanta University Hospitals, Otolaryngology Head and Neck Surgery Department, Audiovestibular Unit, Tanta, Egypt
- Takwa A Gabr
- Kafrelsheikh University Hospitals, Otolaryngology Head and Neck Surgery Department, Audiovestibular Unit, Kafrelsheikh, Egypt
- Mohamed E Darwish
- Tanta University Hospitals, Otolaryngology Head and Neck Surgery Department, Phoniatrics Unit, Tanta, Egypt
- Fatma M Seleem
- Tanta University Hospitals, Otolaryngology Head and Neck Surgery Department, Audiovestibular Unit, Tanta, Egypt
49
Weighting of Prosodic and Lexical-Semantic Cues for Emotion Identification in Spectrally Degraded Speech and With Cochlear Implants. Ear Hear 2021; 42:1727-1740. [PMID: 34294630 PMCID: PMC8545870 DOI: 10.1097/aud.0000000000001057] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES Normal-hearing (NH) listeners rely more on prosodic cues than on lexical-semantic cues for emotion perception in speech. In everyday spoken communication, the ability to decipher conflicting information between prosodic and lexical-semantic cues to emotion can be important: for example, in identifying sarcasm or irony. Speech degradation with cochlear implants (CIs) can be sufficiently overcome to identify lexical-semantic cues, but the distortion of voice pitch cues makes it particularly challenging to hear prosody with CIs. The purpose of this study was to examine changes in the relative reliance on prosodic and lexical-semantic cues in NH adults listening to spectrally degraded speech and in adult CI users. We hypothesized that, compared with their NH counterparts, CI users would show increased reliance on lexical-semantic cues and reduced reliance on prosodic cues for emotion perception. We predicted that NH listeners would show a similar pattern when listening to CI-simulated versions of emotional speech. DESIGN Sixteen NH adults and 8 postlingually deafened adult CI users participated in the study. Sentences were created to convey five lexical-semantic emotions (angry, happy, neutral, sad, and scared), with five sentences expressing each category of emotion. Each of these 25 sentences was then recorded with each of the 5 prosodic emotions (angry, happy, neutral, sad, and scared) by 2 adult female talkers. The resulting stimulus set included 125 recordings (25 Sentences × 5 Prosodic Emotions) per talker, of which 25 were congruent (consistent lexical-semantic and prosodic cues to emotion) and the remaining 100 were incongruent (conflicting lexical-semantic and prosodic cues to emotion). The recordings were processed to create three levels of spectral degradation: full-spectrum and CI-simulated (noise-vocoded) with 8 and 16 channels of spectral information, respectively. Twenty-five recordings (one sentence per lexical-semantic emotion recorded in all five prosodies) were used for a practice run in the full-spectrum condition. The remaining 100 recordings were used as test stimuli. For each talker and condition of spectral degradation, listeners indicated the emotion associated with each recording in a single-interval, five-alternative forced-choice task. Responses were scored as proportion correct, where "correct" responses corresponded to the lexical-semantic emotion. CI users heard only the full-spectrum condition. RESULTS The results showed a significant interaction between hearing status (NH, CI) and congruency in identifying the lexical-semantic emotion associated with the stimuli. This interaction was as predicted: CI users showed increased reliance on lexical-semantic cues in the incongruent conditions, while NH listeners showed increased reliance on the prosodic cues in the incongruent conditions. As predicted, NH listeners also showed increased reliance on lexical-semantic cues to emotion when the stimuli were spectrally degraded. CONCLUSIONS The present study confirmed previous findings of prosodic dominance for emotion perception by NH listeners in the full-spectrum condition. Further, novel findings with CI patients and with NH listeners in the CI-simulated conditions showed reduced reliance on prosodic cues and increased reliance on lexical-semantic cues to emotion. These results have implications for CI listeners' ability to perceive conflicts between prosodic and lexical-semantic cues, with repercussions for their identification of sarcasm and humor. Difficulty understanding sarcasm or humor can affect a person's ability to develop relationships, follow conversation, understand a speaker's vocal emotion and intended message, appreciate jokes, and communicate in everyday settings.
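A minimal sketch of the stimulus crossing and scoring described above; the names and data structures are illustrative, not the authors' materials:

```python
# Sketch: crossing 25 sentences (5 per lexical-semantic emotion) with 5
# prosodic emotions yields 125 recordings per talker: 25 congruent and
# 100 incongruent.
from itertools import product

emotions = ["angry", "happy", "neutral", "sad", "scared"]
sentences = [(emo, i) for emo in emotions for i in range(5)]  # 25 sentences
stimuli = list(product(sentences, emotions))                  # 125 recordings
congruent = [s for s in stimuli if s[0][0] == s[1]]           # 25
incongruent = [s for s in stimuli if s[0][0] != s[1]]         # 100

def proportion_correct(responses, stims):
    """Score 'correct' as choosing the lexical-semantic emotion."""
    hits = sum(resp == sent[0] for resp, (sent, _) in zip(responses, stims))
    return hits / len(stims)
```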
50
Nogueira W, Boghdady NE, Langner F, Gaudrain E, Başkent D. Effect of Channel Interaction on Vocal Cue Perception in Cochlear Implant Users. Trends Hear 2021; 25:23312165211030166. [PMID: 34461780 PMCID: PMC8411629 DOI: 10.1177/23312165211030166] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2020] [Revised: 06/14/2021] [Accepted: 06/16/2021] [Indexed: 11/16/2022] Open
Abstract
Speech intelligibility in multitalker settings is challenging for most cochlear implant (CI) users. One possible reason for this limitation is the suboptimal representation of vocal cues in implant processing, such as the fundamental frequency (F0) and the vocal tract length (VTL). Previous studies suggested that while F0 perception depends on spectrotemporal cues, VTL perception relies largely on spectral cues. To investigate how spectral smearing in CIs affects vocal cue perception in speech-on-speech (SoS) settings, adjacent electrodes were simultaneously stimulated using current steering in 12 Advanced Bionics users to simulate channel interaction. In current steering, two adjacent electrodes are stimulated simultaneously, forming a channel of parallel stimulation. Three such stimulation patterns were used: Sequential (one current-steering channel), Paired (two channels), and Triplet stimulation (three channels). F0 and VTL just-noticeable differences (JNDs; Task 1), SoS intelligibility (Task 2), and SoS comprehension (Task 3) were measured for each stimulation strategy. In Tasks 2 and 3, four maskers were used: the same female talker, a male voice obtained by manipulating both the F0 and VTL (F0+VTL) of the original female speaker, a voice where only F0 was manipulated, and a voice where only VTL was manipulated. JNDs were measured relative to the original voice for the F0, VTL, and F0+VTL manipulations. When spectral smearing was increased from Sequential to Triplet stimulation, a significant deterioration in performance was observed for Tasks 1 and 2, with no differences between Sequential and Paired stimulation. Data from Task 3 were inconclusive. These results imply that CI users may tolerate certain amounts of channel interaction without a significant reduction in performance on tasks relying on voice perception, pointing to possibilities for using parallel stimulation in CIs to reduce power consumption.
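A minimal sketch of the current-steering idea named above: the total current for a channel is split between two adjacent electrodes with complementary weights, steering the percept between their cochlear places. The current value and weight are illustrative placeholders:

```python
# Sketch: complementary current weighting across an adjacent electrode
# pair, as used in current steering to form a parallel channel.
def steer(total_current_ua: float, alpha: float) -> tuple[float, float]:
    """Split current between electrode e and electrode e+1; alpha in [0, 1]."""
    assert 0.0 <= alpha <= 1.0
    return (1.0 - alpha) * total_current_ua, alpha * total_current_ua

# alpha = 0.5 stimulates both electrodes equally (a 'virtual channel').
i_e, i_e1 = steer(200.0, 0.5)   # 100 uA on each electrode of the pair
```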
Affiliation(s)
- Waldo Nogueira
- Department of Otolaryngology, Medical University Hannover and Cluster of Excellence Hearing4all, Hanover, Germany
- Nawal El Boghdady
- Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Research School of Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands
- Florian Langner
- Department of Otolaryngology, Medical University Hannover and Cluster of Excellence Hearing4all, Hanover, Germany
- Etienne Gaudrain
- Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Research School of Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands
- Lyon Neuroscience Research Center, CNRS UMR 5292, INSERM U1028, University Lyon 1, Lyon, France
- Deniz Başkent
- Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Research School of Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands