1
Srinivasan N, Patro C, Kansangra R, Trotman A. Comparison of Psychometric Functions Measured Using Remote Testing and Laboratory Testing. Audiol Res 2024; 14:469-478. [PMID: 38804463 PMCID: PMC11130947 DOI: 10.3390/audiolres14030039]
Abstract
The use of remote testing to collect behavioral data has been on the rise, especially after the COVID-19 pandemic. Here we present psychometric functions for a commonly used speech corpus obtained under remote testing and laboratory testing conditions in young normal-hearing listeners in the presence of different types of maskers. Headphone use in the remote testing group was verified with a Huggins pitch task, supplementing procedures from the prior literature. Results revealed no significant differences between the thresholds measured under the remote testing and laboratory testing conditions for all three masker types. In addition, the thresholds obtained under these two conditions were strongly correlated for a separate group of young normal-hearing listeners. Based on these results, remote testing can yield excellent auditory threshold measurements for stimuli presented both below and above an individual's speech-recognition threshold.
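For context on the headphone screening mentioned above: a Huggins-pitch check presents noise that is identical at the two ears except for an interaural phase inversion in a narrow band, which produces a faint pitch over headphones but not over loudspeakers. The sketch below generates such a stimulus; the centre frequency, bandwidth, duration, and sample rate are illustrative assumptions rather than values reported in the study.

```python
# Minimal sketch of a Huggins-pitch stimulus of the kind used for headphone
# screening. The centre frequency, bandwidth, duration, and sample rate below
# are illustrative assumptions, not parameters reported in the cited study.
import numpy as np

def huggins_pitch_stimulus(fc=600.0, bw_ratio=0.16, dur=0.5, fs=44100, seed=0):
    """Return an (n_samples, 2) stereo array: identical noise in both ears,
    except for a 180-degree interaural phase shift in a narrow band at fc."""
    rng = np.random.default_rng(seed)
    n = int(dur * fs)
    noise = rng.standard_normal(n)

    spec = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    # Invert the phase in the right ear only, inside a narrow band around fc.
    band = (freqs > fc * (1 - bw_ratio / 2)) & (freqs < fc * (1 + bw_ratio / 2))
    spec_right = spec.copy()
    spec_right[band] *= -1.0  # 180-degree phase shift

    left = np.fft.irfft(spec, n)
    right = np.fft.irfft(spec_right, n)
    stereo = np.stack([left, right], axis=1)
    return stereo / np.max(np.abs(stereo))  # normalize to +/-1

stim = huggins_pitch_stimulus()
print(stim.shape)  # (22050, 2)
```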
Affiliation(s)
- Nirmal Srinivasan
- Department of Speech-Language Pathology and Audiology, Towson University, Towson, MD 21252, USA
2
Wang J, Xie S, Stenfelt S, Zhou H, Wang X, Sang J. Spatial Release From Masking With Bilateral Bone Conduction Stimulation at Mastoid for Normal Hearing Subjects. Trends Hear 2024; 28:23312165241234202. [PMID: 38549451 PMCID: PMC10981249 DOI: 10.1177/23312165241234202]
Abstract
This study investigates the effect of spatial release from masking (SRM) in bilateral bone conduction (BC) stimulation at the mastoid. Nine adults with normal hearing were tested to determine SRM based on speech recognition thresholds (SRTs) in simulated spatial configurations ranging from 0 to 180 degrees. These configurations were based on nonindividualized head-related transfer functions. The participants were subjected to sound stimulation through either air conduction (AC) via headphones or BC. The results indicated that both the angular separation between the target and the masker, and the modality of sound stimulation, significantly influenced speech recognition performance. As the angular separation between the target and the masker increased up to 150°, both BC and AC SRTs decreased, indicating improved performance. However, performance slightly deteriorated when the angular separation exceeded 150°. For spatial separations less than 75°, BC stimulation provided greater spatial benefits than AC, although this difference was not statistically significant. For separations greater than 75°, AC stimulation offered significantly more spatial benefits than BC. When speech and noise originated from the same side of the head, the "better ear effect" did not significantly contribute to SRM. However, when speech and noise were located on opposite sides of the head, this effect became dominant in SRM.
Affiliation(s)
- Jie Wang
- School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, China
- Sijia Xie
- School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, China
- Stefan Stenfelt
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden
- Huali Zhou
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen, China
- Xiaoya Wang
- Otolaryngology Department, Guangzhou Women and Children's Medical Center, Guangzhou, China
- Jinqiu Sang
- Shanghai Institute of AI for Education, East China Normal University, Shanghai, China
3
Byrne AJ, Conroy C, Kidd G. Individual differences in speech-on-speech masking are correlated with cognitive and visual task performance. J Acoust Soc Am 2023; 154:2137-2153. [PMID: 37800988 PMCID: PMC10631817 DOI: 10.1121/10.0021301]
Abstract
Individual differences in spatial tuning for masked target speech identification were determined using maskers that varied in type and proximity to the target source. The maskers were chosen to produce three strengths of informational masking (IM): high [same-gender, speech-on-speech (SOS) masking], intermediate (the same masker speech time-reversed), and low (speech-shaped, speech-envelope-modulated noise). Typical for this task, individual differences increased as IM increased, while overall performance decreased. To determine the extent to which auditory performance might generalize to another sensory modality, a comparison visual task was also implemented. Visual search time was measured for identifying a cued object among "clouds" of distractors that were varied symmetrically in proximity to the target. The visual maskers also were chosen to produce three strengths of an analog of IM based on feature similarities between the target and maskers. Significant correlations were found for overall auditory and visual task performance, and both of these measures were correlated with an index of general cognitive reasoning. Overall, the findings provide qualified support for the proposition that the ability of an individual to solve IM-dominated tasks depends on cognitive mechanisms that operate in common across sensory modalities.
Affiliation(s)
- Andrew J Byrne
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, Boston, Massachusetts 02215, USA
- Christopher Conroy
- Department of Biological and Vision Sciences, State University of New York College of Optometry, New York, New York 10036, USA
- Gerald Kidd
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, Boston, Massachusetts 02215, USA
4
Thompson NJ, Brown KD, Buss E, Rooth MA, Richter ME, Dillon MT. Long-Term Binaural Hearing Improvements for Cochlear Implant Users with Asymmetric Hearing Loss. Laryngoscope 2023; 133:1480-1485. [PMID: 36053850 DOI: 10.1002/lary.30368]
Abstract
OBJECTIVE To assess long-term binaural hearing abilities for cochlear implant (CI) users with unilateral hearing loss (UHL) or asymmetric hearing loss (AHL). METHODS A prospective, longitudinal, repeated measures study was completed at a tertiary referral center evaluating adults with UHL or AHL undergoing cochlear implantation. Binaural hearing abilities were assessed with masked speech recognition tasks using AzBio sentences in a 10-talker masker. Performance was evaluated as the ability to benefit from spatial release from masking (SRM). SRM was calculated as the difference in scores when the masker was presented toward the CI-ear (SRMci) or the contralateral ear (SRMcontra) relative to the co-located condition (0°). Assessments were completed pre-operatively and at annual intervals out to 5 years post-activation. RESULTS Twenty UHL and 19 AHL participants were included in the study. Linear Mixed Models showed significant main effects of interval and group for SRMcontra. There was a significant interaction between interval and group, with UHL participants reaching asymptotic performance early and AHL participants demonstrating continued growth in binaural abilities to 5 years post-activation. The improvement in SRM showed a significant positive correlation with contralateral unaided hearing thresholds (p = 0.050) as well as age at implantation (p = 0.031). CONCLUSIONS CI recipients with UHL and AHL showed improved SRM with long-term device use. The time course of improvement varied by cohort, with the UHL cohort reaching asymptotic performance early and the AHL cohort continuing to improve beyond 1 year. Differences between cohorts could be driven by differences in age at implantation as well as contralateral unaided hearing thresholds. LEVEL OF EVIDENCE 3 Laryngoscope, 133:1480-1485, 2023.
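The SRM measures in this abstract reduce to simple differences relative to the co-located (0°) condition. Written out, with S(·) denoting the masked speech-recognition score for the indicated masker position (a minimal formalization of the text above, not notation taken from the paper):

$$\mathrm{SRM}_{\mathrm{ci}} = S(\text{masker toward CI ear}) - S(0^{\circ}), \qquad \mathrm{SRM}_{\mathrm{contra}} = S(\text{masker toward contralateral ear}) - S(0^{\circ}).$$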
Affiliation(s)
- Nicholas J Thompson
- Department of Otolaryngology-Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Kevin D Brown
- Department of Otolaryngology-Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Emily Buss
- Department of Otolaryngology-Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Meredith A Rooth
- Department of Otolaryngology-Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Margaret E Richter
- Department of Otolaryngology-Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Division of Speech and Hearing Sciences, Department of Allied Health Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Margaret T Dillon
- Department of Otolaryngology-Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Division of Speech and Hearing Sciences, Department of Allied Health Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
5
Lelo de Larrea-Mancera ES, Solís-Vivanco R, Sánchez-Jimenez Y, Coco L, Gallun FJ, Seitz AR. Development and validation of a Spanish-language spatial release from masking task in a Mexican population. J Acoust Soc Am 2023; 153:316. [PMID: 36732214 PMCID: PMC10162838 DOI: 10.1121/10.0016850]
Abstract
This study validates a new Spanish-language version of the Coordinate Response Measure (CRM) corpus using a well-established measure of spatial release from masking (SRM). Participants were 96 Spanish-speaking young adults without hearing complaints in Mexico City. To present the Spanish-language SRM test, we created new recordings of the CRM with Spanish-language translations and updated the freely available app (PART; https://ucrbraingamecenter.github.io/PART_Utilities/) to present materials in Spanish. In addition to SRM, we collected baseline data on a battery of non-speech auditory assessments, including detection of frequency modulations, temporal gaps, and modulated broadband noise in the temporal, spectral, and spectrotemporal domains. Data demonstrate that the newly developed speech and non-speech tasks show similar reliability to an earlier report in English-speaking populations. This study demonstrates an approach by which auditory assessment for clinical and basic research can be extended to Spanish-speaking populations for whom testing platforms are not currently available.
Affiliation(s)
- Rodolfo Solís-Vivanco
- Laboratory of Cognitive and Clinical Neurophysiology, Instituto Nacional de Neurología y Neurocirugía Manuel Velasco Suárez (INNNMVS), Avenue Insurgentes Sur 3877, La Fama, Tlalpan, Mexico City, CDMX 14269, Mexico
- Laura Coco
- Department of Otolaryngology, Oregon Health & Science University, Portland, Oregon 97239, USA
- Frederick J Gallun
- Department of Otolaryngology, Oregon Health & Science University, Portland, Oregon 97239, USA
- Aaron R Seitz
- Department of Psychology, University of California, 900 University Avenue, Riverside, California 92507, USA
6
Zenke K, Rosen S. Spatial release of masking in children and adults in non-individualized virtual environments. J Acoust Soc Am 2022; 152:3384. [PMID: 36586845 DOI: 10.1121/10.0016360]
Abstract
The spatial release of masking (SRM) is often measured in virtual auditory environments created from head-related transfer functions (HRTFs) of a standardized adult head. Adults and children, however, differ in head dimensions and mismatched HRTFs are known to affect some aspects of binaural hearing. So far, there has been little research on HRTFs in children and it is unclear whether a large mismatch of spatial cues can degrade speech perception in complex environments. In two studies, the effect of non-individualized virtual environments on SRM accuracy in adults and children was examined. The SRMs were measured in virtual environments created from individual and non-individualized HRTFs and the equivalent real anechoic environment. Speech reception thresholds (SRTs) were measured for frontal target sentences and symmetrical speech maskers at 0° or ±90° azimuth. No significant difference between environments was observed for adults. In 7 to 12-year-old children, SRTs and SRMs improved with age, with SRMs approaching adult levels. SRTs differed slightly between environments and were significantly worse in a virtual environment based on HRTFs from a spherical head. Adult HRTFs seem sufficient to accurately measure SRTs in children even in complex listening conditions.
Affiliation(s)
- Katharina Zenke
- Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London, WC1N 1PF, United Kingdom
- Stuart Rosen
- Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London, WC1N 1PF, United Kingdom
7
Ozmeral EJ, Higgins NC. Defining functional spatial boundaries using a spatial release from masking task. JASA Express Lett 2022; 2:124402. [PMID: 36586966 PMCID: PMC9720634 DOI: 10.1121/10.0015356]
Abstract
The classic spatial release from masking (SRM) task measures speech recognition thresholds for discrete separation angles between a target and masker. Alternatively, this study used a modified SRM task that adaptively measured the spatial-separation angle needed between a continuous male target stream (speech with digits) and two female masker streams to achieve a specific SRM. On average, 20 young normal-hearing listeners needed less spatial separation for 6 dB release than 9 dB release, and the presence of background babble reduced across-listener variability on the paradigm. Future work is needed to better understand the psychometric properties of this adaptive procedure.
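To make the adaptive manipulation concrete: rather than fixing the separation angle and measuring a threshold, the procedure tracks the angle itself. The sketch below is a generic staircase on separation angle driven by a simulated listener; the growth-of-release function, step size, and stopping rule are illustrative assumptions and do not reproduce the authors' adaptive procedure.

```python
# Illustrative sketch of an adaptive track on spatial-separation angle.
# This is a generic 1-up/1-down staircase driven by a simulated listener,
# not the specific procedure used in the cited study; the psychometric
# parameters and step sizes below are assumptions.
import numpy as np

rng = np.random.default_rng(1)

def p_correct_at(angle_deg, target_release_db):
    """Simulated probability of a correct response: release from masking
    grows with separation and is compared against the target release."""
    achieved_release = 9.0 * (1.0 - np.exp(-angle_deg / 30.0))  # assumed growth
    return 1.0 / (1.0 + np.exp(-(achieved_release - target_release_db)))

def adaptive_angle_track(target_release_db, n_trials=60, start_angle=90.0, step=10.0):
    angle, reversals, last_direction = start_angle, [], 0
    for _ in range(n_trials):
        correct = rng.random() < p_correct_at(angle, target_release_db)
        direction = -1 if correct else +1   # narrow after correct, widen after error
        if last_direction and direction != last_direction:
            reversals.append(angle)
        last_direction = direction
        angle = float(np.clip(angle + direction * step, 0.0, 180.0))
    return np.mean(reversals[-6:]) if len(reversals) >= 6 else angle

# Smaller separations should suffice for a 6-dB release than for a 9-dB release.
print(adaptive_angle_track(6.0), adaptive_angle_track(9.0))
```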
Affiliation(s)
- Erol J Ozmeral
- Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
- Nathan C Higgins
- Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
8
Denanto FM, Wales J, Tideholm B, Asp F. Differing Bilateral Benefits for Spatial Release From Masking and Sound Localization Accuracy Using Bone Conduction Devices. Ear Hear 2022; 43:1708-1720. [PMID: 35588503 PMCID: PMC9592172 DOI: 10.1097/aud.0000000000001234]
Abstract
OBJECTIVES Normal binaural hearing facilitates spatial hearing and therefore many everyday listening tasks, such as understanding speech against a backdrop of competing sounds originating from various locations, and localization of sounds. For stimulation with bone conduction hearing devices (BCD), used to alleviate conductive hearing losses, limited transcranial attenuation results in cross-stimulation so that both cochleae are stimulated from the position of the bone conduction transducer. As such, interaural time and level differences, hallmarks of binaural hearing, are unpredictable at the level of the inner ears. The aim of this study was to compare spatial hearing by unilateral and bilateral BCD stimulation in normal-hearing listeners with simulated bilateral conductive hearing loss. DESIGN Bilateral conductive hearing loss was reversibly induced in 25 subjects (mean age = 28.5 years) with air conduction and bone conduction (BC) pure-tone averages across 0.5, 1, 2, and 4 kHz (PTA4) <5 dB HL. The mean (SD) PTA4 for the simulated conductive hearing loss was 48.2 dB (3.8 dB). Subjects participated in a speech-in-speech task and a horizontal sound localization task in a within-subject repeated measures design (unilateral and bilateral bone conduction stimulation) using Baha 5 clinical sound processors on a softband. For the speech-in-speech task, the main outcome measure was the threshold for 40% correct speech recognition when masking speech and target speech were both colocated (0°) and spatially and symmetrically separated (target 0°, maskers ±30° and ±150°). Spatial release from masking was quantified as the difference between colocated and separated masking and target speech thresholds. For the localization task, the main outcome measure was the overall variance in localization accuracy quantified as an error index (0.0 = perfect performance; 1.0 = random performance). Four stimuli providing various spatial cues were used in the sound localization task. RESULTS The bilateral BCD benefit for recognition thresholds of speech in competing speech was statistically significant but small, regardless of whether the masking speech signals were colocated with, or spatially and symmetrically separated from, the target speech. Spatial release from masking was identical for unilateral and bilateral conditions, and significantly different from zero. A distinct bilateral BCD sound localization benefit existed but varied in magnitude across stimuli. The smallest benefit occurred for a low-frequency stimulus (octave-filtered noise, CF = 0.5 kHz), and the largest benefit occurred for unmodulated broadband and narrowband (octave-filtered noise, CF = 4.0 kHz) stimuli. Sound localization by unilateral BCD was poor across stimuli. CONCLUSIONS Results suggest that the well-known transcranial transmission of BC sound affects bilateral BCD benefits for spatial processing of sound in differing ways. Results further suggest that patients with bilateral conductive hearing loss and BC thresholds within the normal range may benefit from a bilateral fitting of BCD, particularly for horizontal localization of sounds.
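The localization outcome above is a normalized error index scaled so that 0.0 is perfect performance and 1.0 is chance-level responding. The abstract does not give the formula, so the sketch below uses one plausible normalization (mean absolute error divided by the error expected for random loudspeaker choices); the cited study's exact definition may differ.

```python
# Hedged sketch of a normalized localization "error index" of the kind
# described above (0.0 = perfect, 1.0 = chance). The normalization used
# here (mean absolute error divided by the error expected when responses
# are drawn uniformly at random from the loudspeaker positions) is an
# assumption for illustration; the cited study may define it differently.
import numpy as np

def error_index(targets_deg, responses_deg, speaker_positions_deg):
    targets = np.asarray(targets_deg, dtype=float)
    responses = np.asarray(responses_deg, dtype=float)
    positions = np.asarray(speaker_positions_deg, dtype=float)

    observed = np.mean(np.abs(responses - targets))
    # Expected absolute error if the listener guessed a loudspeaker at random.
    chance = np.mean([np.mean(np.abs(positions - t)) for t in targets])
    return observed / chance

speakers = np.linspace(-90, 90, 11)      # 11 loudspeakers in a 180-degree arc
targets = np.repeat(speakers, 3)         # three presentations per position
perfect = error_index(targets, targets, speakers)
random_resp = error_index(targets, np.random.default_rng(0).choice(speakers, targets.size), speakers)
print(round(perfect, 2), round(random_resp, 2))   # ~0.0 and ~1.0 on average
```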
Affiliation(s)
- Fatima M. Denanto
- Division of Ear, Nose and Throat Diseases, Department of Clinical Science, Intervention and Technology, Karolinska Institutet, Stockholm, Sweden
- Karolinska University Hospital, Stockholm, Sweden
- Jeremy Wales
- Division of Ear, Nose and Throat Diseases, Department of Clinical Science, Intervention and Technology, Karolinska Institutet, Stockholm, Sweden
- Karolinska University Hospital, Stockholm, Sweden
- Bo Tideholm
- Division of Ear, Nose and Throat Diseases, Department of Clinical Science, Intervention and Technology, Karolinska Institutet, Stockholm, Sweden
- Division of Surgery, County Hospital, Nykoping, Sweden
- Filip Asp
- Division of Ear, Nose and Throat Diseases, Department of Clinical Science, Intervention and Technology, Karolinska Institutet, Stockholm, Sweden
- Karolinska University Hospital, Stockholm, Sweden
9
Anderson KM, Buss E, Rooth MA, Richter ME, Overton AB, Brown KD, Dillon MT. Masked Speech Recognition as a Function of Masker Location for Cochlear Implant Users With Single-Sided Deafness. Am J Audiol 2022; 31:757-763. [PMID: 35877957 DOI: 10.1044/2022_aja-21-00268]
Abstract
PURPOSE Cochlear implant (CI) recipients with normal or near normal hearing (NH) in the contralateral ear, referred to as single-sided deafness (SSD), experience significantly better speech recognition in noise with their CI than without it, although reported outcomes vary. One possible explanation for differences in outcomes across studies could be differences in the spatial configurations used to assess performance. This study compared speech recognition for different spatial configurations of the target and masker, with test materials used clinically. METHOD Sixteen CI users with SSD completed tasks of masked speech recognition presented in five spatial configurations. The target speech was presented from the front speaker (0° azimuth). The masker was located either 90° or 45° toward the CI-ear or NH-ear or colocated with the target. Materials were the AzBio sentences in a 10-talker masker and the Bamford-Kowal-Bench Speech-in-Noise test (BKB-SIN; four-talker masker). Spatial release from masking (SRM) was computed as the benefit associated with spatial separation relative to the colocated condition. RESULTS Performance was significantly better when the masker was separated toward the CI-ear as compared to colocated. No benefit was observed for spatial separations toward the NH-ear. The magnitude of SRM for spatial separations toward the CI-ear was similar for 45° and 90° when tested with the AzBio sentences, but a larger benefit was observed for 90° as compared to 45° for the BKB-SIN. CONCLUSIONS Masked speech recognition in CI users with SSD varies as a function of the spatial configuration of the target and masker. Results supported an expansion of the clinical test battery at the study site to assess binaural hearing abilities for CI candidates and recipients with SSD. The revised test battery presents the target from the front speaker and the masker colocated with the target, 90° toward the CI-ear, or 90° toward the NH-ear.
Affiliation(s)
- Kelly M Anderson
- Department of Otolaryngology/Head & Neck Surgery, University of North Carolina at Chapel Hill
- Division of Speech and Hearing Sciences, Department of Allied Health Sciences, University of North Carolina at Chapel Hill
- Emily Buss
- Department of Otolaryngology/Head & Neck Surgery, University of North Carolina at Chapel Hill
- Meredith A Rooth
- Department of Otolaryngology/Head & Neck Surgery, University of North Carolina at Chapel Hill
- Margaret E Richter
- Department of Otolaryngology/Head & Neck Surgery, University of North Carolina at Chapel Hill
- Kevin D Brown
- Department of Otolaryngology/Head & Neck Surgery, University of North Carolina at Chapel Hill
- Margaret T Dillon
- Department of Otolaryngology/Head & Neck Surgery, University of North Carolina at Chapel Hill
- Division of Speech and Hearing Sciences, Department of Allied Health Sciences, University of North Carolina at Chapel Hill
10
Gallun FJ, Coco L, Koerner TK, de Larrea-Mancera ESL, Molis MR, Eddins DA, Seitz AR. Relating Suprathreshold Auditory Processing Abilities to Speech Understanding in Competition. Brain Sci 2022; 12(6):695. [PMID: 35741581 PMCID: PMC9221421 DOI: 10.3390/brainsci12060695]
Abstract
(1) Background: Difficulty hearing in noise is exacerbated in older adults. Older adults are more likely to have audiometric hearing loss, although some individuals with normal pure-tone audiograms also have difficulty perceiving speech in noise. Additional variables also likely account for speech understanding in noise. It has been suggested that one important class of variables is the ability to process auditory information once it has been detected. Here, we tested a set of these “suprathreshold” auditory processing abilities and related them to performance on a two-part test of speech understanding in competition with and without spatial separation of the target and masking speech. Testing was administered in the Portable Automated Rapid Testing (PART) application developed by our team; PART facilitates psychoacoustic assessments of auditory processing. (2) Methods: Forty-one individuals (average age 51 years) completed assessments of sensitivity to temporal fine structure (TFS) and spectrotemporal modulation (STM) detection via an iPad running the PART application. Statistical models were used to evaluate the strength of associations between performance on the auditory processing tasks and speech understanding in competition. Age and pure-tone average (PTA) were also included as potential predictors. (3) Results: The model providing the best fit also included age and a measure of diotic frequency modulation (FM) detection but none of the other potential predictors. However, even the best fitting models accounted for 31% or less of the variance, supporting work suggesting that other variables (e.g., cognitive processing abilities) also contribute significantly to speech understanding in noise. (4) Conclusions: The results of the current study do not provide strong support for previous suggestions that suprathreshold processing abilities alone can be used to explain difficulties in speech understanding in competition among older adults. This discrepancy could be due to the speech tests used, the listeners tested, or the suprathreshold tests chosen. Future work with larger numbers of participants is warranted, including a range of cognitive tests and additional assessments of suprathreshold auditory processing abilities.
Affiliation(s)
- Frederick J. Gallun
- Oregon Hearing Research Center, Oregon Health & Science University, Portland, OR 97239, USA
- VA RR&D National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, OR 97239, USA
- Correspondence: Tel.: +1-503-494-4331
- Laura Coco
- Oregon Hearing Research Center, Oregon Health & Science University, Portland, OR 97239, USA
- VA RR&D National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, OR 97239, USA
- Tess K. Koerner
- Oregon Hearing Research Center, Oregon Health & Science University, Portland, OR 97239, USA
- VA RR&D National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, OR 97239, USA
- Michelle R. Molis
- VA RR&D National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, OR 97239, USA
- David A. Eddins
- Department of Communication Science & Disorders, University of South Florida, Tampa, FL 33620, USA
- Aaron R. Seitz
- Department of Psychology, University of California, Riverside, CA 92521, USA
11
Abstract
Identification of speech from a "target" talker was measured in a speech-on-speech masking task with two simultaneous "masker" talkers. The overall level of each talker was either fixed or randomized throughout each stimulus presentation to investigate the effectiveness of level as a cue for segregating competing talkers and attending to the target. Experimental manipulations included varying the level difference between talkers and imposing three types of target level uncertainty: 1) fixed target level across trials, 2) random target level across trials, or 3) random target levels on a word-by-word basis within a trial. When the target level was predictable, performance was better than in corresponding conditions in which the target level was uncertain. Masker confusions were consistent with a high degree of informational masking (IM). Furthermore, evidence was found for "tuning" in level and a level "release" from IM. These findings suggest that conforming to listener expectation about relative level, in addition to cues signaling talker identity, facilitates segregation of, and maintaining focus of attention on, a specific talker in multiple-talker communication situations.
Affiliation(s)
- Andrew J Byrne
- Department of Speech, Language, & Hearing Sciences, Boston University, MA, USA
- Christopher Conroy
- Department of Speech, Language, & Hearing Sciences, Boston University, MA, USA
- Gerald Kidd
- Department of Speech, Language, & Hearing Sciences, Boston University, MA, USA
- Department of Otolaryngology, Head-Neck Surgery, Medical University of South Carolina, Charleston, SC, USA
12
Theodoroff SM, Gallun FJ, McMillan GP, Molis M, Srinivasan N, Gordon J, McDermott D, Konrad-Martin D. Impacts of Diabetes, Aging, and Hearing Loss on Speech-on-Speech Masking and Spatial Release in a Large Veteran Cohort. Am J Audiol 2021; 30:1023-1036. [PMID: 34633838 DOI: 10.1044/2021_aja-21-00022]
Abstract
PURPOSE Type 2 diabetes mellitus (DM2) is associated with impaired hearing. However, the evidence is less clear if DM2 can lead to difficulty understanding speech in complex acoustic environments, independently of age and hearing loss effects. The purpose of this study was to estimate the magnitude of DM2-related effects on speech understanding in the presence of competing speech after adjusting for age and hearing. METHOD A cross-sectional study design was used to investigate the relationship between DM2 and speech understanding in 190 Veterans (M age = 47 years, range: 25-76). Participants were classified as having no diabetes (n = 74), prediabetes (n = 19), or DM2 that was well controlled (n = 24) or poorly controlled (n = 73). A test of spatial release from masking (SRM) was presented in a virtual acoustical simulation over insert earphones with multiple talkers using sentences from the coordinate response measure corpus to determine the target-to-masker ratio (TMR) required for 50% correct identification of target speech. A linear mixed model of the TMR results was used to estimate SRM and separate effects of diabetes group, age, and low-frequency pure-tone average (PTA-low) and high-frequency pure-tone average. A separate model estimated the effects of DM2 on PTA-low. RESULTS After adjusting for hearing and age, diabetes-related effects remained among those whose DM2 was well controlled, showing an SRM loss of approximately 0.5 dB. Results also showed effects of hearing loss and age, consistent with the literature on people without DM2. Low-frequency hearing loss was greater among those with DM2. CONCLUSIONS In a large cohort of Veterans, low-frequency hearing loss and older age negatively impact speech understanding. Compared with nondiabetics, individuals with controlled DM2 have additional auditory deficits beyond those associated with hearing loss or aging. These results provide a potential explanation for why individuals who have diabetes and/or are older often report difficulty understanding speech in real-world listening environments. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.16746475.
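The linear mixed model described here can be summarized schematically; the coding of the spatial condition and the random-effects structure below are assumptions for illustration rather than the authors' exact specification. With TMR_ic the target-to-masker ratio for listener i in condition c (co-located vs. separated):

$$\mathrm{TMR}_{ic} = \beta_0 + \beta_1\,\mathrm{Sep}_c + \beta_2\,\mathrm{Group}_i + \beta_3\,\mathrm{Age}_i + \beta_4\,\mathrm{PTA}^{\mathrm{low}}_i + \beta_5\,\mathrm{PTA}^{\mathrm{high}}_i + u_i + \varepsilon_{ic},$$

where u_i is a per-listener random intercept, epsilon is residual error, and SRM is estimated as the model-predicted difference between the co-located and separated conditions.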
Affiliation(s)
- Sarah M. Theodoroff
- VA Rehabilitation Research and Development Service, National Center for Rehabilitative Auditory Research, VA Portland Health Care System, United States Department of Veterans Affairs, OR
- Department of Otolaryngology—Head & Neck Surgery, Oregon Health & Science University, Portland
- Frederick J. Gallun
- VA Rehabilitation Research and Development Service, National Center for Rehabilitative Auditory Research, VA Portland Health Care System, United States Department of Veterans Affairs, OR
- Department of Otolaryngology—Head & Neck Surgery, Oregon Health & Science University, Portland
- Garnett P. McMillan
- VA Rehabilitation Research and Development Service, National Center for Rehabilitative Auditory Research, VA Portland Health Care System, United States Department of Veterans Affairs, OR
- Michelle Molis
- VA Rehabilitation Research and Development Service, National Center for Rehabilitative Auditory Research, VA Portland Health Care System, United States Department of Veterans Affairs, OR
- Department of Otolaryngology—Head & Neck Surgery, Oregon Health & Science University, Portland
- Nirmal Srinivasan
- Department of Speech-Language Pathology & Audiology, Towson University, MD
- Jane Gordon
- VA Rehabilitation Research and Development Service, National Center for Rehabilitative Auditory Research, VA Portland Health Care System, United States Department of Veterans Affairs, OR
- Daniel McDermott
- VA Rehabilitation Research and Development Service, National Center for Rehabilitative Auditory Research, VA Portland Health Care System, United States Department of Veterans Affairs, OR
- Dawn Konrad-Martin
- VA Rehabilitation Research and Development Service, National Center for Rehabilitative Auditory Research, VA Portland Health Care System, United States Department of Veterans Affairs, OR
- Department of Otolaryngology—Head & Neck Surgery, Oregon Health & Science University, Portland
13
Srinivasan NK, Staudenmeier A, Clark K. Effect of gap detection threshold and localisation acuity on spatial release from masking in older adults. Int J Audiol 2021; 61:932-939. [PMID: 34793273 DOI: 10.1080/14992027.2021.1961168]
Abstract
OBJECTIVE The primary objective of this experiment was to measure the temporal and spatial processing capabilities of older individuals and use statistical models to identify the individual contributions of these temporal and spatial processing capabilities to spatial release from masking (SRM). DESIGN Repeated measures. STUDY SAMPLE Twenty-five older listeners with varying degrees of hearing loss participated in this experiment. SRM using the coordinate response measure, gap detection thresholds and localisation acuity for 1/3-octave-wide Gaussian noise bands centred at 500 and 4000 Hz were measured for all the listeners. RESULTS Older listeners had better speech recognition thresholds when target and maskers were spatially separated as compared to when they were co-located. In addition, hearing loss and localisation acuity at 500 Hz were significant predictors in a multiple regression model predicting SRM. However, gap detection thresholds did not significantly contribute to the multiple regression model predicting SRM. CONCLUSION Based on our data, we conclude that SRM at 30° spatial separation between the target and symmetric maskers is driven by the ability of the individuals to use interaural time difference cues.
Affiliation(s)
- Alexis Staudenmeier
- Department of Speech-Language Pathology and Audiology, Towson University, Towson, MD, USA
- Kelli Clark
- Department of Speech-Language Pathology and Audiology, Towson University, Towson, MD, USA
14
Corbin NE, Buss E, Leibold LJ. Spatial Hearing and Functional Auditory Skills in Children With Unilateral Hearing Loss. J Speech Lang Hear Res 2021; 64:4495-4512. [PMID: 34609204 PMCID: PMC9132156 DOI: 10.1044/2021_jslhr-20-00081]
Abstract
Purpose The purpose of this study was to characterize spatial hearing abilities of children with longstanding unilateral hearing loss (UHL). UHL was expected to negatively impact children's sound source localization and masked speech recognition, particularly when the target and masker were separated in space. Spatial release from masking (SRM) in the presence of a two-talker speech masker was expected to predict functional auditory performance as assessed by parent report. Method Participants were 5- to 14-year-olds with sensorineural or mixed UHL, age-matched children with normal hearing (NH), and adults with NH. Sound source localization was assessed on the horizontal plane (-90° to 90°), with noise that was either all-pass, low-pass, high-pass, or an unpredictable mixture. Speech recognition thresholds were measured in the sound field for sentences presented in two-talker speech or speech-shaped noise. Target speech was always presented from 0°; the masker was either colocated with the target or spatially separated at ±90°. Parents of children with UHL rated their children's functional auditory performance in everyday environments via questionnaire. Results Sound source localization was poorer for children with UHL than those with NH. Children with UHL also derived less SRM than those with NH, with increased masking for some conditions. Effects of UHL were larger in the two-talker than the noise masker, and SRM in two-talker speech increased with age for both groups of children. Children with UHL whose parents reported greater functional difficulties achieved less SRM when either masker was on the side of the better-hearing ear. Conclusions Children with UHL are clearly at a disadvantage compared with children with NH for both sound source localization and masked speech recognition with spatial separation. Parents' report of their children's real-world communication abilities suggests that spatial hearing plays an important role in outcomes for children with UHL.
Affiliation(s)
- Nicole E. Corbin
- Department of Communication Science and Disorders, University of Pittsburgh, PA
- Emily Buss
- Department of Otolaryngology—Head & Neck Surgery, School of Medicine, University of North Carolina at Chapel Hill
- Lori J. Leibold
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
15
Effects of Simulated and Profound Unilateral Sensorineural Hearing Loss on Recognition of Speech in Competing Speech. Ear Hear 2021; 41:411-419. [PMID: 31356386 DOI: 10.1097/aud.0000000000000764]
Abstract
OBJECTIVES Unilateral hearing loss (UHL) is a condition as common as bilateral hearing loss in adults. Because of the unilaterally reduced audibility associated with UHL, binaural processing of sounds may be disrupted. As a consequence, daily tasks such as listening to speech in a background of spatially distinct competing sounds may be challenging. A growing body of subjective and objective data suggests that spatial hearing is negatively affected by UHL. However, the type and degree of UHL vary considerably in previous studies. The aim here was to determine the effect of a profound sensorineural UHL, and of a simulated UHL, on recognition of speech in competing speech, and the binaural and monaural contributions to spatial release from masking, in a demanding multisource listening environment. DESIGN Nine subjects (25 to 61 years) with profound sensorineural UHL [mean pure-tone average (PTA) across 0.5, 1, 2, and 4 kHz = 105 dB HL] and normal contralateral hearing (mean PTA = 7.2 dB HL) were included based on the criterion that the target and competing speech were inaudible in the ear with hearing loss. Thirteen subjects with normal hearing (19 to 60 years; mean left PTA = 4.1 dB HL; mean right PTA = 5.5 dB HL) contributed data in normal and simulated "mild-to-moderate" UHL conditions (PTA = 38.6 dB HL). The main outcome measure was the threshold for 40% correct speech recognition in colocated (0°) and spatially and symmetrically separated (±30° and ±150°) competing speech conditions. Spatial release from masking was quantified as the threshold difference between colocated and separated conditions. RESULTS Thresholds in profound UHL were higher (worse) than normal hearing in separated and colocated conditions, and comparable to simulated UHL. Monaural spatial release from masking, that is, the spatial release achieved by subjects with profound UHL, was significantly different from zero and 49% of the magnitude of the spatial release from masking achieved by subjects with normal hearing. There were subjects with profound UHL who showed negative spatial release, whereas subjects with normal hearing consistently showed positive spatial release from masking in the normal condition. The simulated UHL had a larger effect on the speech recognition threshold for separated than for colocated conditions, resulting in decreased spatial release from masking. The difference in spatial release between normal-hearing and simulated UHL conditions increased with age. CONCLUSIONS The results demonstrate that while recognition of speech in colocated and separated competing speech is impaired for profound sensorineural UHL, spatial release from masking may be possible when competing speech is symmetrically distributed around the listener. A "mild-to-moderate" simulated UHL decreases spatial release from masking compared with normal-hearing conditions and interacts with age, indicating that small amounts of residual hearing in the UHL ear may be more beneficial for separated than for colocated interferer conditions for young listeners.
16
Har-shai Yahav P, Zion Golumbic E. Linguistic processing of task-irrelevant speech at a cocktail party. eLife 2021; 10:e65096. [PMID: 33942722 PMCID: PMC8163500 DOI: 10.7554/elife.65096]
Abstract
Paying attention to one speaker in a noisy place can be extremely difficult, because to-be-attended and task-irrelevant speech compete for processing resources. We tested whether this competition is restricted to acoustic-phonetic interference or if it extends to competition for linguistic processing as well. Neural activity was recorded using Magnetoencephalography as human participants were instructed to attend to natural speech presented to one ear, and task-irrelevant stimuli were presented to the other. Task-irrelevant stimuli consisted either of random sequences of syllables, or syllables structured to form coherent sentences, using hierarchical frequency-tagging. We find that the phrasal structure of structured task-irrelevant stimuli was represented in the neural response in left inferior frontal and posterior parietal regions, indicating that selective attention does not fully eliminate linguistic processing of task-irrelevant speech. Additionally, neural tracking of to-be-attended speech in left inferior frontal regions was enhanced when competing with structured task-irrelevant stimuli, suggesting inherent competition between them for linguistic processing.
Affiliation(s)
- Paz Har-shai Yahav
- The Gonda Center for Multidisciplinary Brain Research, Bar Ilan University, Ramat Gan, Israel
- Elana Zion Golumbic
- The Gonda Center for Multidisciplinary Brain Research, Bar Ilan University, Ramat Gan, Israel
17
Associations Between Hearing Health and Well-Being in Unilateral Hearing Impairment. Ear Hear 2021; 42:520-530. [DOI: 10.1097/aud.0000000000000969]
18
Goverts ST, Colburn HS. Binaural Recordings in Natural Acoustic Environments: Estimates of Speech-Likeness and Interaural Parameters. Trends Hear 2021; 24:2331216520972858. [PMID: 33331242 PMCID: PMC7750905 DOI: 10.1177/2331216520972858]
Abstract
Binaural acoustic recordings were made in multiple natural environments, which were chosen to be similar to those reported to be difficult for listeners with impaired hearing. These environments include natural conversations that take place in the presence of other sound sources as found in restaurants, walking or biking in the city, and so on. Sounds from these environments were recorded binaurally with in-the-ear microphones and were analyzed with respect to speech-likeness measures and interaural difference measures. The speech-likeness measures were based on amplitude–modulation patterns within frequency bands and were estimated for 1-s time-slices. The interaural difference measures included interaural coherence, interaural time difference, and interaural level difference, which were estimated for time-slices of 20-ms duration. These binaural measures were documented for one-fourth-octave frequency bands centered at 500 Hz and for the envelopes of one-fourth-octave bands centered at 2000 Hz. For comparison purposes, the same speech-likeness and interaural difference measures were computed for a set of virtual recordings that mimic typical clinical test configurations. These virtual recordings were created by filtering anechoic waveforms with available head-related transfer functions and combining them to create multiple source combinations. Overall, the speech-likeness results show large variability within and between environments, and they demonstrate the importance of having information from both ears available. Furthermore, the interaural parameter results show that the natural recordings contain a relatively small proportion of time-slices with high coherence compared with the virtual recordings; however, when present, binaural cues might be used for selecting intervals with good speech intelligibility for individual sources.
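The interaural measures described above (coherence, ITD, and ILD per 20-ms time-slice) can be illustrated with a short frame-based analysis. In the sketch below, the 20-ms frame length follows the abstract, but the broadband analysis, the +/-1-ms ITD search range, and the synthetic test signal are simplifying assumptions; the study analyzed one-fourth-octave bands and band envelopes.

```python
# Minimal sketch of frame-based interaural parameter estimates: interaural
# coherence and ITD from the peak of the normalized cross-correlation, and
# ILD from the frame power ratio. Frame length (20 ms) follows the abstract;
# the +/-1 ms ITD range and full-band analysis are simplifying assumptions.
import numpy as np

def interaural_frame_params(left, right, fs, frame_ms=20.0, max_itd_ms=1.0):
    n = int(fs * frame_ms / 1000)
    max_lag = int(fs * max_itd_ms / 1000)
    results = []
    for start in range(0, min(len(left), len(right)) - n + 1, n):
        l = left[start:start + n] - np.mean(left[start:start + n])
        r = right[start:start + n] - np.mean(right[start:start + n])
        denom = np.sqrt(np.sum(l**2) * np.sum(r**2))
        if denom == 0:
            continue
        lags = np.arange(-max_lag, max_lag + 1)
        xcorr = np.array([np.sum(l * np.roll(r, lag)) for lag in lags]) / denom
        k = int(np.argmax(np.abs(xcorr)))
        coherence = float(np.abs(xcorr[k]))      # peak normalized correlation
        itd_ms = 1000.0 * lags[k] / fs           # lag of the peak
        ild_db = 10.0 * np.log10(np.sum(l**2) / np.sum(r**2))
        results.append((coherence, itd_ms, ild_db))
    return np.array(results)

fs = 16000
left = np.random.default_rng(2).standard_normal(fs)       # 1 s of noise
right = np.roll(left, int(0.0005 * fs)) * 0.7              # 0.5-ms delay, ~3 dB weaker
print(interaural_frame_params(left, right, fs)[:3])
```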
Affiliation(s)
- S Theo Goverts
- Otolaryngology-Head and Neck Surgery, Ear & Hearing, Amsterdam Public Health, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- H Steven Colburn
- Biomedical Engineering Department, Boston University, Boston, Massachusetts, United States
19
Spatial Hearing as a Function of Presentation Level in Moderate-to-Severe Unilateral Conductive Hearing Loss. Otol Neurotol 2021; 41:167-172. [PMID: 31834211 DOI: 10.1097/mao.0000000000002475]
Abstract
HYPOTHESIS Patients with moderate-to-severe unilateral conductive hearing loss (UCHL) can make use of binaural difference cues when stimuli are presented at a high enough intensity to provide audibility in the affected ear. BACKGROUND Spatial hearing is essential for listening in complex environments and sound source localization. Patients with UCHL have decreased access to binaural difference cues, resulting in poorer spatial hearing abilities compared with listeners with normal hearing. METHODS Twelve patients with moderate-to-severe UCHL, most due to atresia (83.3%), and 12 age-matched controls with normal hearing bilaterally participated in this study. Outcome measures included: 1) spatial release from masking, and 2) sound source localization. Speech reception thresholds were measured with target speech (Pediatric AzBio sentences) presented at 0 degree and a two-talker masker that was either colocated with the target (0 degree) or spatially separated from the target (symmetrical, ±90 degrees). Spatial release from masking was quantified as the difference between speech reception thresholds in these two conditions. Localization ability in the horizontal plane was assessed in a 180 degree arc of 11 evenly-spaced loudspeakers. These two tasks were completed at 50 and 75 dB SPL. RESULTS Both children and adults with UCHL performed more poorly than controls when recognizing speech in a spatially separated masker or localizing sound; however, this group difference was larger at 50 than 75 dB SPL. CONCLUSION Patients with UCHL experience improved spatial hearing with the higher presentation level, suggesting that the auditory deprivation associated with a moderate-to-severe UCHL does not preclude exposure to, or use of, binaural difference cues.
20
Versfeld NJ, Lie S, Kramer SE, Zekveld AA. Informational masking with speech-on-speech intelligibility: Pupil response and time-course of learning. J Acoust Soc Am 2021; 149:2353. [PMID: 33940918 DOI: 10.1121/10.0003952]
Abstract
Previous research has shown a learning effect on speech perception in nonstationary maskers. The present study addressed the time-course of this learning effect and the role of informational masking. To that end, speech reception thresholds (SRTs) were measured for speech in either a stationary noise masker, an interrupted noise masker, or a single-talker masker. The utterance of the single talker was either time-forward (intelligible) or time-reversed (unintelligible), and the sample of the utterance was either frozen (same utterance at each presentation) or random (different utterance at each presentation but from the same speaker). Simultaneously, the pupil dilation response was measured to assess differences in the listening effort between conditions and to track changes in the listening effort over time within each condition. The results showed a learning effect for all conditions but the stationary noise condition; that is, improvement in SRT over time while maintaining equal pupil responses. There were no significant differences in pupil responses between conditions despite large differences in the SRT. Time reversal of the frozen speech affected neither the SRT nor pupil responses.
Affiliation(s)
- Niek J Versfeld
- Amsterdam Universitair Medisch Centrum, Vrije Universiteit Amsterdam, Otolaryngology Head and Neck Surgery, Ear and Hearing, Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
- Sisi Lie
- Amsterdam Universitair Medisch Centrum, Vrije Universiteit Amsterdam, Otolaryngology Head and Neck Surgery, Ear and Hearing, Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
- Sophia E Kramer
- Amsterdam Universitair Medisch Centrum, Vrije Universiteit Amsterdam, Otolaryngology Head and Neck Surgery, Ear and Hearing, Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
- Adriana A Zekveld
- Amsterdam Universitair Medisch Centrum, Vrije Universiteit Amsterdam, Otolaryngology Head and Neck Surgery, Ear and Hearing, Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
21
Wang X, Xu L. Speech perception in noise: Masking and unmasking. J Otol 2021; 16:109-119. [PMID: 33777124 PMCID: PMC7985001 DOI: 10.1016/j.joto.2020.12.001]
Abstract
Speech perception is essential for daily communication. Background noise or concurrent talkers, on the other hand, can make it challenging for listeners to track the target speech (i.e., cocktail party problem). The present study reviews and compares existing findings on speech perception and unmasking in cocktail party listening environments in English and Mandarin Chinese. The review starts with an introduction section followed by related concepts of auditory masking. The next two sections review factors that release speech perception from masking in English and Mandarin Chinese, respectively. The last section presents an overall summary of the findings with comparisons between the two languages. Future research directions with respect to the difference in literature on the reviewed topic between the two languages are also discussed.
Affiliation(s)
- Xianhui Wang
- Communication Sciences and Disorders, Ohio University, Athens, OH, 45701, USA
- Li Xu
- Communication Sciences and Disorders, Ohio University, Athens, OH, 45701, USA
22
Ahrens A, Cuevas-Rodriguez M, Brimijoin WO. Speech intelligibility with various head-related transfer functions: A computational modelling approach. JASA Express Lett 2021; 1:034401. [PMID: 36154562 DOI: 10.1121/10.0003618]
Abstract
Speech intelligibility (SI) is known to be affected by the relative spatial positions of the target and interferers. The benefit of spatial separation is, along with other factors, related to the head-related transfer function (HRTF). HRTFs differ across individuals, and thus the cues that affect SI might also differ. In the current study, an auditory model was employed to predict SI with various HRTFs and at different angles on the horizontal plane. The predicted SI threshold was found to differ substantially across HRTFs. Thus, individual listeners might have different access to SI cues, depending on their HRTF.
Affiliation(s)
- Axel Ahrens
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, 2800 Kongens Lyngby, Denmark
23
Hausfeld L, Shiell M, Formisano E, Riecke L. Cortical processing of distracting speech in noisy auditory scenes depends on perceptual demand. Neuroimage 2020; 228:117670. [PMID: 33359352 DOI: 10.1016/j.neuroimage.2020.117670]
Abstract
Selective attention is essential for the processing of multi-speaker auditory scenes because they require the perceptual segregation of the relevant speech ("target") from irrelevant speech ("distractors"). For simple sounds, it has been suggested that the processing of multiple distractor sounds depends on bottom-up factors affecting task performance. However, it remains unclear whether such dependency applies to naturalistic multi-speaker auditory scenes. In this study, we tested the hypothesis that increased perceptual demand (the processing requirement posed by the scene to separate the target speech) reduces the cortical processing of distractor speech, thus decreasing its perceptual segregation. Human participants were presented with auditory scenes including three speakers and asked to selectively attend to one speaker while their EEG was acquired. The perceptual demand of this selective listening task was varied by introducing an auditory cue (interaural time differences, ITDs) for segregating the target from the distractor speakers, while acoustic differences between the distractors were matched in ITD and loudness. We obtained a quantitative measure of the cortical segregation of distractor speakers by assessing the difference in how accurately speech-envelope following EEG responses could be predicted by models of averaged distractor speech versus models of individual distractor speech. In agreement with our hypothesis, results show that interaural segregation cues led to improved behavioral word-recognition performance and stronger cortical segregation of the distractor speakers. The neural effect was strongest in the δ-band and at early delays (0-200 ms). Our results indicate that during low perceptual demand, the human cortex represents individual distractor speech signals as more segregated. This suggests that, in addition to purely acoustical properties, the cortical processing of distractor speakers depends on factors like perceptual demand.
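The model-comparison logic described in this abstract (predicting envelope-following EEG responses from averaged versus individual distractor envelopes) can be sketched with a lagged linear, temporal-response-function-style model. The ridge regularization, lag range, sampling rate, and synthetic signals below are illustrative assumptions, not the authors' pipeline.

```python
# Hedged sketch of the model comparison: predict an EEG channel from a
# stimulus envelope with a lagged (FIR) linear model and compare prediction
# accuracy for two competing envelope representations (e.g., averaged
# distractors vs. an individual distractor). All parameters and data here
# are synthetic and illustrative.
import numpy as np

def lagged_design(env, n_lags):
    """Build a design matrix whose k-th column is the envelope delayed by k samples."""
    X = np.zeros((env.size, n_lags))
    for k in range(n_lags):
        X[k:, k] = env[:env.size - k]
    return X

def trf_prediction_accuracy(env, eeg, n_lags=32, ridge=1.0):
    X = lagged_design(env, n_lags)
    w = np.linalg.solve(X.T @ X + ridge * np.eye(n_lags), X.T @ eeg)  # ridge fit
    pred = X @ w
    return np.corrcoef(pred, eeg)[0, 1]

rng = np.random.default_rng(3)
fs = 64                                  # envelope/EEG sampling rate (assumed)
n = fs * 60
env_a = rng.standard_normal(n)           # e.g., averaged-distractor envelope
env_b = rng.standard_normal(n)           # e.g., individual-distractor envelope
true_kernel = np.exp(-np.arange(32) / 6.0)
eeg = lagged_design(env_b, 32) @ true_kernel + 0.5 * rng.standard_normal(n)

# Higher accuracy for env_b would indicate the EEG tracks the individual
# distractor better than the averaged-distractor model.
print(trf_prediction_accuracy(env_a, eeg), trf_prediction_accuracy(env_b, eeg))
```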
Collapse
Affiliation(s)
- Lars Hausfeld
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200MD Maastricht, The Netherlands; Maastricht Brain Imaging Centre, 6200MD Maastricht, The Netherlands.
| | - Martha Shiell
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200MD Maastricht, The Netherlands; Maastricht Brain Imaging Centre, 6200MD Maastricht, The Netherlands
| | - Elia Formisano
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200MD Maastricht, The Netherlands; Maastricht Brain Imaging Centre, 6200MD Maastricht, The Netherlands; Maastricht Centre for Systems Biology, 6200MD Maastricht, The Netherlands
| | - Lars Riecke
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200MD Maastricht, The Netherlands; Maastricht Brain Imaging Centre, 6200MD Maastricht, The Netherlands
| |
Collapse
|
24
|
Srinivasan NK, Holtz A, Gallun FJ. Comparing Spatial Release From Masking Using Traditional Methods and Portable Automated Rapid Testing iPad App. Am J Audiol 2020; 29:907-915. [PMID: 33197327 PMCID: PMC8608168 DOI: 10.1044/2020_aja-20-00078] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2020] [Revised: 08/06/2020] [Accepted: 09/02/2020] [Indexed: 11/09/2022] Open
Abstract
Purpose The purpose of this study was to compare the speech identification abilities of individuals of various ages and hearing abilities using traditional methods and the Portable Automated Rapid Testing (PART) iPad app. Method Speech identification data were collected using three techniques: over headphones using a virtual speaker array, using the PART iPad app (UCR Brain Game Center, 2018), and using loudspeaker presentation in a sound-attenuated room. For all three techniques, Coordinate Response Measure sentences were used as the stimuli and "Charlie" was used as the call sign. A progressive tracking procedure was used to estimate the speech identification thresholds for listeners with varying hearing thresholds. The target sentence was always presented at 0° azimuth, whereas the maskers were colocated (0°) with the target or symmetrically spatially separated by ±15°, ±30°, or ±45°. Results Data analysis revealed similar speech identification thresholds for the iPad and headphone conditions and slightly poorer thresholds for the loudspeaker array condition across participant groups. This was true for all spatial separations between the target and the maskers. Conclusion The strong correlation between the headphone and iPad data presented in this study indicates that the spatial release from masking module in the PART iPad app can be used as a clinical tool to assess spatial processing ability prior to audiologic evaluation in the clinic, and can also be used to make recommendations for and to track progress with aural rehabilitation programs over time.
Collapse
Affiliation(s)
| | - Allison Holtz
- Department of Speech-Language Pathology & Audiology, Towson University, MD
| | - Frederick J. Gallun
- Oregon Health & Science University, Department of Otolaryngology–Head & Neck Surgery, Portland, OR
- Veterans Affairs Rehabilitation Research & Development National Center for Rehabilitative Auditory Research, VA Portland Health Care System, OR
| |
Collapse
|
25
|
Middlebrooks JC, Waters MF. Spatial Mechanisms for Segregation of Competing Sounds, and a Breakdown in Spatial Hearing. Front Neurosci 2020; 14:571095. [PMID: 33041763 PMCID: PMC7525094 DOI: 10.3389/fnins.2020.571095] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2020] [Accepted: 08/21/2020] [Indexed: 01/02/2023] Open
Abstract
We live in complex auditory environments, in which we are confronted with multiple competing sounds, including the cacophony of talkers in busy markets, classrooms, offices, etc. The purpose of this article is to synthesize observations from a series of experiments that focused on how spatial hearing might aid in disentangling interleaved sequences of sounds. The experiments were unified by a non-verbal task, "rhythmic masking release", which was applied to psychophysical studies in humans and cats and to cortical physiology in anesthetized cats. Human and feline listeners could segregate competing sequences of sounds from sources that were separated by as little as ∼10°. Similarly, single neurons in the cat primary auditory cortex tended to synchronize selectively to sound sequences from one of two competing sources, again with spatial resolution of ∼10°. The spatial resolution of spatial stream segregation varied widely depending on the binaural and monaural acoustical cues that were available in various experimental conditions. This is in contrast to a measure of basic sound-source localization, the minimum audible angle, which showed largely constant acuity across those conditions. The differential utilization of acoustical cues suggests that the central spatial mechanisms for stream segregation differ from those for sound localization. The highest-acuity spatial stream segregation was derived from interaural time and level differences. Brainstem processing of those cues is thought to rely heavily on normal function of a voltage-gated potassium channel, Kv3.3. A family was studied having a dominant negative mutation in the gene for that channel. Affected family members exhibited severe loss of sensitivity for interaural time and level differences, which almost certainly would degrade their ability to segregate competing sounds in real-world auditory scenes.
Collapse
Affiliation(s)
- John C. Middlebrooks
- Departments of Otolaryngology, Neurobiology and Behavior, Cognitive Sciences, and Biomedical Engineering, University of California, Irvine, Irvine, CA, United States
| | - Michael F. Waters
- Department of Neurology, Barrow Neurological Institute, Phoenix, AZ, United States
| |
Collapse
|
26
|
Liu JS, Yu YF, Tao DD, Li Y, Ye F, Galvin JJ, Gopen Q, Fu QJ. Effects of Monaural Asymmetry and Target-Masker Similarity on Binaural Advantage in Children and Adults With Normal Hearing. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:2811-2824. [PMID: 32777196 DOI: 10.1044/2020_jslhr-19-00269] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Purpose For colocated targets and maskers, binaural listening typically offers a small but significant advantage over monaural listening. This study investigated how monaural asymmetry and target-masker similarity may limit binaural advantage in adults and children. Method Ten Mandarin-speaking Chinese adults (aged 22-27 years) and 12 children (aged 7-14 years) with normal hearing participated in the study. Monaural and binaural speech recognition thresholds (SRTs) were adaptively measured for colocated competing speech. The target-masker sex was the same or different. Performance was measured using headphones for three listening conditions: left ear, right ear, and both ears. Binaural advantage was calculated relative to the poorer or better ear. Results Mean SRTs were significantly lower for adults than children. When the target-masker sex was the same, SRTs were significantly lower with the better ear than with the poorer ear or both ears (p < .05). When the target-masker sex was different, SRTs were significantly lower with the better ear or both ears than with the poorer ear (p < .05). Children and adults similarly benefitted from target-masker sex differences. Substantial monaural asymmetry was observed, but the effects of asymmetry on binaural advantage were similar between adults and children. Monaural asymmetry was significantly correlated with binaural advantage relative to the poorer ear (p = .004), but not to the better ear (p = .056). Conclusions Binaural listening may offer little advantage (or even a disadvantage) over monaural listening with the better ear, especially when competing talkers have similar vocal characteristics. Monaural asymmetry appears to limit binaural advantage in listeners with normal hearing, similar to observations in listeners with hearing impairment. While language development may limit perception of competing speech, it does not appear to limit the effects of monaural asymmetry or target-masker sex on binaural advantage.
Collapse
Affiliation(s)
- Ji-Sheng Liu
- Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou, China
| | - Ya-Feng Yu
- Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou, China
| | - Duo-Duo Tao
- Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou, China
| | - Yi Li
- Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou, China
| | - Fei Ye
- Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou, China
| | | | - Quinton Gopen
- Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles, CA
| | - Qian-Jie Fu
- Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles, CA
| |
Collapse
|
27
|
Kubiak AM, Rennies J, Ewert SD, Kollmeier B. Prediction of individual speech recognition performance in complex listening conditions. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:1379. [PMID: 32237817 DOI: 10.1121/10.0000759] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/04/2019] [Accepted: 01/31/2020] [Indexed: 06/11/2023]
Abstract
This study examined how well individual speech recognition thresholds in complex listening scenarios could be predicted by a current binaural speech intelligibility model. Model predictions were compared with experimental data measured for seven normal-hearing and 23 hearing-impaired listeners who differed widely in their degree of hearing loss, age, as well as performance in clinical speech tests. The experimental conditions included two masker types (multi-talker or two-talker maskers), and two spatial conditions (maskers co-located with the frontal target or symmetrically separated from the target). The results showed that interindividual variability could not be well predicted by a model including only individual audiograms. Predictions improved when an additional individual "proficiency factor" was derived from one of the experimental conditions or a standard speech test. Overall, the current model can predict individual performance relatively well (except in conditions high in informational masking), but the inclusion of age-related factors may lead to even further improvements.
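A minimal sketch of the "proficiency factor" idea described above, under the assumption that it can be approximated as a per-listener offset estimated from one reference condition and then applied to the model's predictions for the remaining conditions (the published model is more elaborate, and all numbers below are hypothetical).

    import numpy as np

    # Hypothetical model-predicted and measured SRTs (dB SNR) for one listener
    # across four conditions; condition 0 serves as the reference.
    predicted = np.array([-12.0, -8.5, -14.0, -6.0])
    measured  = np.array([ -9.0, -6.0, -11.5, -3.5])

    proficiency = measured[0] - predicted[0]      # individual offset from reference condition
    adjusted = predicted + proficiency            # proficiency-adjusted predictions

    rmse_raw = np.sqrt(np.mean((predicted[1:] - measured[1:]) ** 2))
    rmse_adj = np.sqrt(np.mean((adjusted[1:]  - measured[1:]) ** 2))
    print(f"RMSE without proficiency factor: {rmse_raw:.2f} dB, with: {rmse_adj:.2f} dB")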
Collapse
Affiliation(s)
- Aleksandra M Kubiak
- Fraunhofer IDMT, Project Group Hearing, Speech and Audio Technology, Cluster of Excellence "Hearing4all," Oldenburg, Germany
| | - Jan Rennies
- Fraunhofer IDMT, Project Group Hearing, Speech and Audio Technology, Cluster of Excellence "Hearing4all," Oldenburg, Germany
| | - Stephan D Ewert
- Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, 26111 Oldenburg, Germany
| | - Birger Kollmeier
- Fraunhofer IDMT, Project Group Hearing, Speech and Audio Technology, Cluster of Excellence "Hearing4all," Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, 26111 Oldenburg, Germany
| |
Collapse
|
28
|
Ahrens A, Marschall M, Dau T. The effect of spatial energy spread on sound image size and speech intelligibility. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:1368. [PMID: 32237851 DOI: 10.1121/10.0000747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/04/2019] [Accepted: 01/30/2020] [Indexed: 06/11/2023]
Abstract
This study explored the relationship between perceived sound image size and speech intelligibility for sound sources reproduced over loudspeakers. Sources with varying degrees of spatial energy spread were generated using ambisonics processing. Young normal-hearing listeners estimated sound image size and performed two spatial release from masking (SRM) tasks with two symmetrically arranged interfering talkers. Either the target-to-masker ratio or the separation angle was varied adaptively. Results showed that the sound image size did not change systematically with the energy spread. However, a larger energy spread did result in a decreased SRM. Furthermore, the listeners needed a greater angular separation between the target and the interfering sources when those sources had a larger energy spread. Further analysis revealed that the method employed to vary the energy spread did not lead to systematic changes in the interaural cross correlations. Future experiments with competing talkers using ambisonics or similar methods may consider the resulting energy spread in relation to the minimum separation angle between sound sources in order to avoid degradations in speech intelligibility.
Collapse
Affiliation(s)
- Axel Ahrens
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Building 352, Ørsteds Plads, 2800 Kongens Lyngby, Denmark
| | - Marton Marschall
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Building 352, Ørsteds Plads, 2800 Kongens Lyngby, Denmark
| | - Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Building 352, Ørsteds Plads, 2800 Kongens Lyngby, Denmark
| |
Collapse
|
29
|
Summers RJ, Roberts B. Informational masking of speech by acoustically similar intelligible and unintelligible interferers. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:1113. [PMID: 32113320 DOI: 10.1121/10.0000688] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/17/2019] [Accepted: 01/19/2020] [Indexed: 06/10/2023]
Abstract
Masking experienced when target speech is accompanied by a single interfering voice is often primarily informational masking (IM). IM is generally greater when the interferer is intelligible than when it is not (e.g., speech from an unfamiliar language), but the relative contributions of acoustic-phonetic and linguistic interference are often difficult to assess owing to acoustic differences between interferers (e.g., different talkers). Three-formant analogues (F1+F2+F3) of natural sentences were used as targets and interferers. Targets were presented monaurally either alone or accompanied contralaterally by interferers from another sentence (F0 = 4 semitones higher); a target-to-masker ratio (TMR) between ears of 0, 6, or 12 dB was used. Interferers were either intelligible or rendered unintelligible by delaying F2 and advancing F3 by 150 ms relative to F1, a manipulation designed to minimize spectro-temporal differences between corresponding interferers. Target-sentence intelligibility (keywords correct) was 67% when presented alone, but fell considerably when an unintelligible interferer was present (49%) and significantly further when the interferer was intelligible (41%). Changes in TMR produced neither a significant main effect nor an interaction with interferer type. Interference with acoustic-phonetic processing of the target can explain much of the impact on intelligibility, but linguistic factors-particularly interferer intrusions-also make an important contribution to IM.
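A minimal sketch of the unintelligibility manipulation described above, assuming the three formant analogues are available as separate waveforms: F2 is delayed and F3 advanced by 150 ms relative to F1 before the channels are summed. The sine carriers standing in for the formant tracks and all parameter values are placeholders, not the authors' synthesis code.

    import numpy as np

    fs = 16000
    shift = int(0.150 * fs)          # 150-ms shift in samples
    t = np.arange(fs) / fs           # 1 s of signal

    # Placeholder formant-analogue waveforms.
    f1 = np.sin(2 * np.pi * 500 * t)
    f2 = np.sin(2 * np.pi * 1500 * t)
    f3 = np.sin(2 * np.pi * 2500 * t)

    def delay(x, n):
        """Delay x by n samples, zero-padding at the start."""
        return np.concatenate([np.zeros(n), x[:-n]])

    def advance(x, n):
        """Advance x by n samples, zero-padding at the end."""
        return np.concatenate([x[n:], np.zeros(n)])

    intelligible   = f1 + f2 + f3
    unintelligible = f1 + delay(f2, shift) + advance(f3, shift)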
Collapse
Affiliation(s)
- Robert J Summers
- Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom
| | - Brian Roberts
- Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom
| |
Collapse
|
30
|
Rigato C, Reinfeldt S, Asp F. The effect of an active transcutaneous bone conduction device on spatial release from masking. Int J Audiol 2019; 59:348-359. [PMID: 31873054 DOI: 10.1080/14992027.2019.1705406] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
Abstract
Objective: The aim was to quantify the effect of the experimental active transcutaneous Bone Conduction Implant (BCI) on spatial release from masking (SRM) in subjects with bilateral or unilateral conductive and mixed hearing loss. Design: Measurements were performed in a sound booth with five loudspeakers at 0°, +/-30° and +/-150° azimuth. Target speech was presented frontally, and interfering speech from either the front (co-located) or surrounding (separated) loudspeakers. SRM was calculated as the difference between the separated and the co-located speech recognition threshold (SRT). Study Sample: Twelve patients (aged 22-76 years) unilaterally implanted with the BCI were included. Results: A positive SRM, reflecting a benefit of spatially separating interferers from target speech, existed for all subjects in the unaided condition, and for nine subjects (75%) in the aided condition. Aided SRM was lower than unaided SRM in nine of the subjects. There was no difference in SRM between patients with bilateral and unilateral hearing loss. In the aided condition, SRT improved only for patients with bilateral hearing loss. Conclusions: The BCI fitted unilaterally in patients with bilateral or unilateral conductive/mixed hearing loss seems to reduce SRM. However, the data indicate that SRT is improved or maintained for patients with bilateral and unilateral hearing loss, respectively.
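A minimal sketch of the SRM computation described above, with the sign convention chosen so that a positive value reflects a benefit of spatial separation (i.e., a lower separated SRT); the threshold values are hypothetical.

    # Hypothetical speech recognition thresholds (dB SNR) for one subject.
    srt_colocated = -2.0   # interferers co-located with the frontal target
    srt_separated = -6.5   # interferers spatially separated from the target

    # Positive SRM indicates that separating the interferers lowered the SRT.
    srm = srt_colocated - srt_separated
    print(f"SRM = {srm:.1f} dB")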
Collapse
Affiliation(s)
- Cristina Rigato
- Division of Signal Processing and Biomedical Engineering, Department of Electrical Engineering, Chalmers University of Technology, Gothenburg, Sweden
| | - Sabine Reinfeldt
- Division of Signal Processing and Biomedical Engineering, Department of Electrical Engineering, Chalmers University of Technology, Gothenburg, Sweden
| | - Filip Asp
- Division of Signal Processing and Biomedical Engineering, Department of Electrical Engineering, Chalmers University of Technology, Gothenburg, Sweden.,Division of Ear, Nose and Throat Diseases, Department of Clinical Science, Intervention and Technology Karolinska Institutet, Stockholm, Sweden
| |
Collapse
|
31
|
The Effects of Dynamic-range Automatic Gain Control on Sentence Intelligibility With a Speech Masker in Simulated Cochlear Implant Listening. Ear Hear 2019; 40:710-724. [PMID: 30204615 DOI: 10.1097/aud.0000000000000653] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES "Channel-linked" and "multi-band" front-end automatic gain control (AGC) were examined as alternatives to single-band, channel-unlinked AGC in simulated bilateral cochlear implant (CI) processing. In channel-linked AGC, the same gain control signal was applied to the input signals to both of the two CIs ("channels"). In multi-band AGC, gain control acted independently on each of a number of narrow frequency regions per channel. DESIGN Speech intelligibility performance was measured with a single target (to the left, at -15 or -30°) and a single, symmetrically-opposed masker (to the right) at a signal-to-noise ratio (SNR) of -2 decibels. Binaural sentence intelligibility was measured as a function of whether channel linking was present and of the number of AGC bands. Analysis of variance was performed to assess condition effects on percent correct across the two spatial arrangements, both at a high and a low AGC threshold. Acoustic analysis was conducted to compare postcompressed better-ear SNR, interaural differences, and monaural within-band envelope levels across processing conditions. RESULTS Analyses of variance indicated significant main effects of both channel linking and number of bands at low threshold, and of channel linking at high threshold. These improvements were accompanied by several acoustic changes. Linked AGC produced a more favorable better-ear SNR and better preserved broadband interaural level difference statistics, but did not reduce dynamic range as much as unlinked AGC. Multi-band AGC sometimes improved better-ear SNR statistics and always improved broadband interaural level difference statistics whenever the AGC channels were unlinked. Multi-band AGC produced output envelope levels that were higher than single-band AGC. CONCLUSIONS These results favor strategies that incorporate channel-linked AGC and multi-band AGC for bilateral CIs. Linked AGC aids speech intelligibility in spatially separated speech, but reduces the degree to which dynamic range is compressed. Combining multi-band and channel-linked AGC offsets the potential impact of diminished dynamic range with linked AGC without sacrificing the intelligibility gains observed with linked AGC.
Collapse
|
32
|
Domingo Y, Holmes E, Macpherson E, Johnsrude IS. Using spatial release from masking to estimate the magnitude of the familiar-voice intelligibility benefit. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 146:3487. [PMID: 31795686 DOI: 10.1121/1.5133628] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/18/2019] [Accepted: 10/23/2019] [Indexed: 06/10/2023]
Abstract
The ability to segregate simultaneous speech streams is crucial for successful communication. Recent studies have demonstrated that participants can report 10%-20% more words spoken by naturally familiar (e.g., friends or spouses) than unfamiliar talkers in two-voice mixtures. This benefit is commensurate with one of the largest benefits to speech intelligibility currently known: that which is gained by spatially separating two talkers. However, because of differences in the methods of these previous studies, the relative benefits of spatial separation and voice familiarity are unclear. Here, the familiar-voice benefit and spatial release from masking are directly compared, and it is examined if and how these two cues interact with one another. Talkers were recorded while speaking sentences from a published closed-set "matrix" task, and then listeners were presented with three different sentences played simultaneously. Each target sentence was played at 0° azimuth, and two masker sentences were symmetrically separated about the target. On average, participants reported 10%-30% more words correctly when the target sentence was spoken in a familiar than unfamiliar voice (collapsed over spatial separation conditions); it was found that participants gain a similar benefit from a familiar target as when an unfamiliar voice is separated from two symmetrical maskers by approximately 15° azimuth.
Collapse
Affiliation(s)
- Ysabel Domingo
- Brain and Mind Institute, University of Western Ontario, London, Ontario, Canada
| | - Emma Holmes
- Brain and Mind Institute, University of Western Ontario, London, Ontario, Canada
| | - Ewan Macpherson
- School of Communication Sciences and Disorders, University of Western Ontario, London, Ontario, Canada
| | - Ingrid S Johnsrude
- Brain and Mind Institute, University of Western Ontario, London, Ontario, Canada
| |
Collapse
|
33
|
Biberger T, Ewert SD. The effect of room acoustical parameters on speech reception thresholds and spatial release from masking. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 146:2188. [PMID: 31671969 DOI: 10.1121/1.5126694] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/18/2019] [Accepted: 08/30/2019] [Indexed: 06/10/2023]
Abstract
In daily life, speech intelligibility is affected by masking caused by interferers and by reverberation. For a frontal target speaker and two interfering sources symmetrically placed to either side, spatial release from masking (SRM) is observed in comparison to frontal interferers. In this case, the auditory system can make use of temporally fluctuating interaural time/phase and level differences promoting binaural unmasking (BU) and better-ear glimpsing (BEG). Reverberation affects the waveforms of the target and maskers, and the interaural differences, depending on the spatial configuration and on the room acoustical properties. In this study, the effects of room acoustics, the temporal structure of the interferers, and target-masker positions on speech reception thresholds and SRM were assessed. The results were compared to an optimal better-ear glimpsing strategy to help disentangle energetic masking, including effects of BU and BEG, as well as informational masking (IM). In anechoic and moderately reverberant conditions, BU and BEG contributed to SRM of fluctuating speech-like maskers, while BU did not contribute in highly reverberant conditions. In highly reverberant rooms, an SRM of up to 3 dB was observed for speech maskers, including effects of release from IM based on binaural cues.
Collapse
Affiliation(s)
- Thomas Biberger
- Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, 26111 Oldenburg, Germany
| | - Stephan D Ewert
- Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, 26111 Oldenburg, Germany
| |
Collapse
|
34
|
Bonacci LM, Dai L, Shinn-Cunningham BG. Weak neural signatures of spatial selective auditory attention in hearing-impaired listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 146:2577. [PMID: 31671991 PMCID: PMC7273515 DOI: 10.1121/1.5129055] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 04/24/2019] [Revised: 09/16/2019] [Accepted: 09/20/2019] [Indexed: 05/17/2023]
Abstract
Spatial attention may be used to select target speech in one location while suppressing irrelevant speech in another. However, if perceptual resolution of spatial cues is weak, spatially focused attention may work poorly, leading to difficulty communicating in noisy settings. In electroencephalography (EEG), the distribution of alpha (8-14 Hz) power over parietal sensors reflects the spatial focus of attention [Banerjee, Snyder, Molholm, and Foxe (2011). J. Neurosci. 31, 9923-9932; Foxe and Snyder (2011). Front. Psychol. 2, 154.] If spatial attention is degraded, however, alpha may not be modulated across parietal sensors. A previously published behavioral and EEG study found that, compared to normal-hearing (NH) listeners, hearing-impaired (HI) listeners often had higher interaural time difference thresholds, worse performance when asked to report the content of an acoustic stream from a particular location, and weaker attentional modulation of neural responses evoked by sounds in a mixture [Dai, Best, and Shinn-Cunningham (2018). Proc. Natl. Acad. Sci. U. S. A. 115, E3286]. This study explored whether these same HI listeners also showed weaker alpha lateralization during the previously reported task. In NH listeners, hemispheric parietal alpha power was greater when the ipsilateral location was attended; this lateralization was stronger when competing melodies were separated by a larger spatial difference. In HI listeners, however, alpha was not lateralized across parietal sensors, consistent with a degraded ability to use spatial features to selectively attend.
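A minimal sketch of an alpha-lateralization measure of the kind described above, assuming two parietal EEG sensors (one per hemisphere): band-limited alpha power is estimated with Welch's method and combined into a simple attend-left versus attend-right lateralization index. The channel layout, synthetic data, and the exact index definition are assumptions, not the study's analysis.

    import numpy as np
    from scipy.signal import welch

    fs = 250
    rng = np.random.default_rng(1)
    # Hypothetical parietal EEG (left/right hemisphere) during an attend-left trial.
    eeg_left_hemi  = rng.standard_normal(30 * fs)
    eeg_right_hemi = rng.standard_normal(30 * fs)

    def alpha_power(x, fs, band=(8, 14)):
        """Integrate the Welch power spectrum over the alpha band."""
        f, pxx = welch(x, fs=fs, nperseg=2 * fs)
        sel = (f >= band[0]) & (f <= band[1])
        return np.trapz(pxx[sel], f[sel])

    p_left  = alpha_power(eeg_left_hemi,  fs)
    p_right = alpha_power(eeg_right_hemi, fs)

    # Positive values indicate more alpha power over the hemisphere ipsilateral
    # to the attended side (here, the left hemisphere for an attend-left trial).
    lateralization = (p_left - p_right) / (p_left + p_right)
    print(f"alpha lateralization index: {lateralization:.3f}")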
Collapse
Affiliation(s)
- Lia M Bonacci
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts 02215, USA
| | - Lengshi Dai
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts 02215, USA
| | | |
Collapse
|
35
|
A Physiologically Inspired Model for Solving the Cocktail Party Problem. J Assoc Res Otolaryngol 2019; 20:579-593. [PMID: 31392449 PMCID: PMC6889086 DOI: 10.1007/s10162-019-00732-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2018] [Accepted: 07/18/2019] [Indexed: 11/05/2022] Open
Abstract
At a cocktail party, we can broadly monitor the entire acoustic scene to detect important cues (e.g., our names being called, or the fire alarm going off), or selectively listen to a target sound source (e.g., a conversation partner). It has recently been observed that individual neurons in the avian field L (analog to the mammalian auditory cortex) can display broad spatial tuning to single targets and selective tuning to a target embedded in spatially distributed sound mixtures. Here, we describe a model inspired by these experimental observations and apply it to process mixtures of human speech sentences. This processing is realized in the neural spiking domain. It converts binaural acoustic inputs into cortical spike trains using a multi-stage model composed of a cochlear filter-bank, a midbrain spatial-localization network, and a cortical network. The output spike trains of the cortical network are then converted back into an acoustic waveform, using a stimulus reconstruction technique. The intelligibility of the reconstructed output is quantified using an objective measure of speech intelligibility. We apply the algorithm to single and multi-talker speech to demonstrate that the physiologically inspired algorithm is able to achieve intelligible reconstruction of an “attended” target sentence embedded in two other non-attended masker sentences. The algorithm is also robust to masker level and displays performance trends comparable to humans. The ideas from this work may help improve the performance of hearing assistive devices (e.g., hearing aids and cochlear implants), speech-recognition technology, and computational algorithms for processing natural scenes cluttered with spatially distributed acoustic objects.
Collapse
|
36
|
Rouhbakhsh N, Mahdi J, Hwo J, Nobel B, Mousave F. Spatial hearing processing: electrophysiological documentation at subcortical and cortical levels. Int J Neurosci 2019; 129:1119-1132. [DOI: 10.1080/00207454.2019.1635129] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Affiliation(s)
- Nematollah Rouhbakhsh
- HEARing Cooperation Research Centre, Melbourne, Australia
- Department of Audiology and Speech Pathology, School of Health Sciences, University of Melbourne, Melbourne, Australia
- National Acoustic Laboratories, Australian Hearing Hub, Macquarie University, Sydney, Australia
- Department of Audiology, School of Rehabilitation, Tehran University of Medical Sciences, Pich-e Shemiran, Tehran, Iran
| | - John Mahdi
- The New York Academy of Sciences, New York, NY, USA
| | - Jacob Hwo
- Department of Biomedical Science, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
| | - Baran Nobel
- Department of Audiology, School of Health and Rehabilitation Sciences, The University of Queensland, Queensland, Australia
| | - Fati Mousave
- Department of Audiology, School of Health and Rehabilitation Sciences, The University of Queensland, Queensland, Australia
| |
Collapse
|
37
|
Murphy CFB, Hashim E, Dillon H, Bamiou DE. British children's performance on the listening in spatialised noise-sentences test (LISN-S). Int J Audiol 2019; 58:754-760. [PMID: 31195858 DOI: 10.1080/14992027.2019.1627592] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Abstract
Objective: To investigate whether British children's performance is equivalent to North American norms on the Listening in Spatialised Noise-Sentences test (LiSN-S). Design: Prospective study comparing the performance of a single British group of children to North-American norms on the LiSN-S (North American version). Study sample: The British group was composed of 46 typically developing children, aged 6-11 years 11 months, from a mainstream primary school in London. Results: No significant difference was observed between the British group's performance and the North-American norms for the Low-cue, High-cue, Spatial Advantage and Total Advantage measures. The British group showed significantly lower performance only for the Talker Advantage measure (z-score: 0.35, 95% confidence interval -0.12 to -0.59). Age was significantly correlated with all unstandardised measures. Conclusion: Our results indicate that, when assessing British children, it would be appropriate to add a corrective factor of 0.35 to the z-score value obtained for the Talker Advantage in order to compare it to the North-American norms. This strategy would enable the use of the LiSN-S in the UK to assess auditory stream segregation based on spatial cues.
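A minimal sketch of the suggested correction, assuming the LiSN-S software returns a Talker Advantage z-score relative to the North American norms; the 0.35 corrective factor is taken from the abstract and the example input value is hypothetical.

    def corrected_talker_advantage(z_north_american, correction=0.35):
        """Shift a British child's Talker Advantage z-score onto the NA norms."""
        return z_north_american + correction

    print(corrected_talker_advantage(-0.50))   # hypothetical raw z-score -> -0.15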
Collapse
Affiliation(s)
- C F B Murphy
- The Ear Institute, University College London , London , UK
| | - E Hashim
- The Ear Institute, University College London , London , UK
| | - H Dillon
- Department of Linguistics, Macquarie University , Sydney , Australia.,Manchester Centre for Audiology and Deafness, University of Manchester , Manchester , UK.,National Acoustic Laboratories (NAL), Macquarie University , Macquarie Park , Australia
| | - D E Bamiou
- The Ear Institute, University College London , London , UK.,University College London Hospitals Biomedical Research Centre, National Institute for Health Research , London , UK
| |
Collapse
|
38
|
Jarollahi F, Amiri M, Jalaie S, Sameni SJ. The effects of auditory spatial training on informational masking release in elderly listeners: a study protocol for a randomized clinical trial. F1000Res 2019; 8:420. [PMID: 31354946 PMCID: PMC6652096 DOI: 10.12688/f1000research.18602.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 06/26/2019] [Indexed: 11/20/2022] Open
Abstract
Background: Given the central auditory system's strong capacity for auditory spatial plasticity and the effects of short-term and long-term rehabilitation programs in elderly people, auditory spatial training may help this population achieve informational masking release and better track speech in noisy environments. The main purposes of this study are to develop an informational masking measurement test and an auditory spatial training program. Protocol: This study will be conducted in two parts. Part 1: develop and determine the validity of an informational masking measurement test by recruiting two groups of young (n=50) and old (n=50) participants with normal hearing who have no difficulty understanding speech in noisy environments. Part 2 (clinical trial): two groups of 60-75-year-olds with normal hearing, who complain about difficulty in speech perception in noisy environments, will participate as control and intervention groups to examine the effect of auditory spatial training. Intervention: 15 sessions of auditory spatial training. The informational masking measurement test and the Speech, Spatial and Qualities of Hearing Scale will be compared between the two groups before intervention, immediately after intervention, and five weeks after intervention. Discussion: Since existing auditory training programs do not address informational masking release, an auditory spatial training program will be designed with the aim of improving hearing in noisy environments for elderly populations. Trial registration: Iranian Registry of Clinical Trials (IRCT20190118042404N1) on 25th February 2019.
Collapse
Affiliation(s)
- Farnoush Jarollahi
- Department of Audiology, School of Rehabilitation Sciences, Iran University of Medical Sciences, Tehran, Iran
| | - Marzieh Amiri
- Department of Audiology, School of Rehabilitation Sciences, Iran University of Medical Sciences, Tehran, Iran
| | - Shohreh Jalaie
- Department of Physiotherapy, School of Rehabilitation Sciences, Tehran University of Medical Sciences, Tehran, Iran
| | - Seyyed Jalal Sameni
- Department of Audiology, School of Rehabilitation Sciences, Iran University of Medical Sciences, Tehran, Iran
| |
Collapse
|
39
|
Rennies J, Best V, Roverud E, Kidd G. Energetic and Informational Components of Speech-on-Speech Masking in Binaural Speech Intelligibility and Perceived Listening Effort. Trends Hear 2019; 23:2331216519854597. [PMID: 31172880 PMCID: PMC6557024 DOI: 10.1177/2331216519854597] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
Abstract
Speech perception in complex sound fields can greatly benefit from different unmasking cues to segregate the target from interfering voices. This study investigated the role of three unmasking cues (spatial separation, gender differences, and masker time reversal) on speech intelligibility and perceived listening effort in normal-hearing listeners. Speech intelligibility and categorically scaled listening effort were measured for a female target talker masked by two competing talkers with no unmasking cues or one to three unmasking cues. In addition to natural stimuli, all measurements were also conducted with glimpsed speech (created by removing the time-frequency tiles of the speech mixture in which the maskers dominated) to estimate the relative amounts of informational and energetic masking as well as the effort associated with source segregation. The results showed that all unmasking cues as well as glimpsing improved intelligibility and reduced listening effort, and that providing more than one cue was beneficial in overcoming informational masking. The reduction in listening effort due to glimpsing corresponded to increases in signal-to-noise ratio of 8 to 18 dB, indicating that a significant amount of listening effort was devoted to segregating the target from the maskers. Furthermore, the benefit in listening effort for all unmasking cues extended well into the range of positive signal-to-noise ratios at which speech intelligibility was at ceiling, suggesting that listening effort is a useful tool for evaluating speech-on-speech masking conditions at typical conversational levels.
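A minimal sketch of the "glimpsed speech" construction described above, assuming a simple STFT-based ideal time-frequency segregation: tiles of the mixture where the target does not dominate the summed maskers are zeroed before resynthesis. The signal contents and STFT settings are placeholders, not the study's processing.

    import numpy as np
    from scipy.signal import stft, istft

    fs = 16000
    rng = np.random.default_rng(2)
    target  = rng.standard_normal(2 * fs)     # placeholder target speech
    masker1 = rng.standard_normal(2 * fs)     # placeholder competing talker 1
    masker2 = rng.standard_normal(2 * fs)     # placeholder competing talker 2
    mixture = target + masker1 + masker2

    f, t, T = stft(target, fs=fs, nperseg=512)
    _, _, M = stft(masker1 + masker2, fs=fs, nperseg=512)
    _, _, X = stft(mixture, fs=fs, nperseg=512)

    # Keep only the time-frequency tiles in which the target dominates the maskers.
    mask = (np.abs(T) > np.abs(M)).astype(float)
    _, glimpsed = istft(X * mask, fs=fs, nperseg=512)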
Collapse
Affiliation(s)
- Jan Rennies
- 1 Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- 2 Fraunhofer Institute for Digital Media Technology IDMT, Project Group Hearing, Speech and Audio Technology, Oldenburg, Germany
- 3 Cluster of Excellence Hearing4all, Carl-von-Ossietzky University, Oldenburg, Germany
| | - Virginia Best
- 1 Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
| | - Elin Roverud
- 1 Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
| | - Gerald Kidd
- 1 Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
| |
Collapse
|
40
|
Villard S, Kidd G. Effects of Acquired Aphasia on the Recognition of Speech Under Energetic and Informational Masking Conditions. Trends Hear 2019; 23:2331216519884480. [PMID: 31694486 PMCID: PMC7000861 DOI: 10.1177/2331216519884480] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2019] [Revised: 09/24/2019] [Accepted: 10/01/2019] [Indexed: 11/16/2022] Open
Abstract
Persons with aphasia (PWA) often report difficulty understanding spoken language in noisy environments that require listeners to identify and selectively attend to target speech while ignoring competing background sounds or “maskers.” This study compared the performance of PWA and age-matched healthy controls (HC) on a masked speech identification task and examined the consequences of different types of masking on performance. Twelve PWA and 12 age-matched HC completed a speech identification task comprising three conditions designed to differentiate between the effects of energetic and informational masking on receptive speech processing. The target and masker speech materials were taken from a closed-set matrix-style corpus, and a forced-choice word identification task was used. Target and maskers were spatially separated from one another in order to simulate real-world listening environments and allow listeners to make use of binaural cues for source segregation. Individualized frequency-specific gain was applied to compensate for the effects of hearing loss. Although both groups showed similar susceptibility to the effects of energetic masking, PWA were more susceptible than age-matched HC to the effects of informational masking. Results indicate that this increased susceptibility cannot be attributed to age, hearing loss, or comprehension deficits and is therefore a consequence of acquired cognitive-linguistic impairments associated with aphasia. This finding suggests that aphasia may result in increased difficulty segregating target speech from masker speech, which in turn may have implications for the ability of PWA to comprehend target speech in multitalker environments, such as restaurants, family gatherings, and other everyday situations.
Collapse
Affiliation(s)
- Sarah Villard
- Department of Speech, Language & Hearing Sciences,
Boston University, MA, USA
| | - Gerald Kidd
- Department of Speech, Language & Hearing Sciences,
Boston University, MA, USA
| |
Collapse
|
41
|
Kidd G, Mason CR, Best V, Roverud E, Swaminathan J, Jennings T, Clayton K, Steven Colburn H. Determining the energetic and informational components of speech-on-speech masking in listeners with sensorineural hearing loss. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 145:440. [PMID: 30710924 PMCID: PMC6347574 DOI: 10.1121/1.5087555] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/20/2018] [Revised: 11/19/2018] [Accepted: 12/18/2018] [Indexed: 05/20/2023]
Abstract
The ability to identify the words spoken by one talker masked by two or four competing talkers was tested in young-adult listeners with sensorineural hearing loss (SNHL). In a reference/baseline condition, masking speech was colocated with target speech, target and masker talkers were female, and the masker was intelligible. Three comparison conditions included replacing female masker talkers with males, time-reversal of masker speech, and spatial separation of sources. All three variables produced significant release from masking. To emulate energetic masking (EM), stimuli were subjected to ideal time-frequency segregation, retaining only the time-frequency units where target energy exceeded masker energy. Subjects were then tested with these resynthesized "glimpsed stimuli." For either two or four maskers, thresholds varied by only about 3 dB across conditions, suggesting that EM was roughly equal. Compared to normal-hearing listeners from an earlier study [Kidd, Mason, Swaminathan, Roverud, Clayton, and Best, J. Acoust. Soc. Am. 140, 132-144 (2016)], SNHL listeners demonstrated both greater energetic and informational masking as well as higher glimpsed thresholds. Individual differences were correlated across masking release conditions, suggesting that listeners could be categorized according to their general ability to solve the task. Overall, both peripheral and central factors appear to contribute to the higher thresholds for SNHL listeners.
Collapse
Affiliation(s)
- Gerald Kidd
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
| | - Christine R Mason
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
| | - Virginia Best
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
| | - Elin Roverud
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
| | - Jayaganesh Swaminathan
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
| | - Todd Jennings
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
| | - Kameron Clayton
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
| | - H Steven Colburn
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts 02215, USA
| |
Collapse
|
42
|
Jakien KM, Gallun FJ. Normative Data for a Rapid, Automated Test of Spatial Release From Masking. Am J Audiol 2018; 27:529-538. [PMID: 30458523 PMCID: PMC6436452 DOI: 10.1044/2018_aja-17-0069] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2017] [Accepted: 01/20/2018] [Indexed: 12/02/2022] Open
Abstract
Purpose The purpose of this study is to report normative data and predict thresholds for a rapid test of spatial release from masking for speech perception. The test is easily administered and has good repeatability, with the potential to be used in clinics and laboratories. Normative functions were generated for adults varying in age and amounts of hearing loss. Method The test of spatial release presents a virtual auditory scene over headphones with 2 conditions: colocated (with target and maskers at 0°) and spatially separated (with target at 0° and maskers at ± 45°). Listener thresholds are determined as target-to-masker ratios, and spatial release from masking (SRM) is determined as the difference between the colocated condition and spatially separated condition. Multiple linear regression was used to fit the data from 82 adults 18–80 years of age with normal to moderate hearing loss (0–40 dB HL pure-tone average [PTA]). The regression equations were then used to generate normative functions that relate age (in years) and hearing thresholds (as PTA) to target-to-masker ratios and SRM. Results Normative functions were able to predict thresholds with an error of less than 3.5 dB in all conditions. In the colocated condition, the function included only age as a predictive parameter, whereas in the spatially separated condition, both age and PTA were included as parameters. For SRM, PTA was the only significant predictor. Different functions were generated for the 1st run, the 2nd run, and the average of the 2 runs. All 3 functions were largely similar in form, with the smallest error being associated with the function on the basis of the average of 2 runs. Conclusion With the normative functions generated from this data set, it would be possible for a researcher or clinician to interpret data from a small number of participants or even a single patient without having to first collect data from a control group, substantially reducing the time and resources needed. Supplemental Material https://doi.org/10.23641/asha.7080878
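A minimal sketch of the normative-function approach described above: a multiple linear regression predicting the spatially separated target-to-masker ratio threshold from age and pure-tone average, fit here to hypothetical data. The coefficients and sample values are illustrative only and do not reproduce the published normative functions.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 82
    age = rng.uniform(18, 80, n)                     # years (hypothetical sample)
    pta = rng.uniform(0, 40, n)                      # dB HL pure-tone average
    # Hypothetical separated-condition thresholds (dB TMR) with some noise.
    tmr_separated = -14 + 0.05 * age + 0.10 * pta + rng.normal(0, 2, n)

    X = np.column_stack([np.ones(n), age, pta])      # intercept, age, PTA
    coef, *_ = np.linalg.lstsq(X, tmr_separated, rcond=None)

    def predicted_threshold(age_years, pta_db):
        """Normative prediction of the separated-condition TMR threshold."""
        return coef[0] + coef[1] * age_years + coef[2] * pta_db

    print(f"Predicted threshold for a 65-year-old with PTA 25 dB HL: "
          f"{predicted_threshold(65, 25):.1f} dB")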
Collapse
Affiliation(s)
- Kasey M. Jakien
- National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Department of Veterans Affairs, OR
- Department of Otolaryngology–Head & Neck Surgery, Oregon Health and Science University, Portland
| | - Frederick J. Gallun
- National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Department of Veterans Affairs, OR
- Department of Otolaryngology–Head & Neck Surgery, Oregon Health and Science University, Portland
| |
Collapse
|
43
|
Rennies J, Kidd G. Benefit of binaural listening as revealed by speech intelligibility and listening effort. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 144:2147. [PMID: 30404476 PMCID: PMC6185866 DOI: 10.1121/1.5057114] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/27/2018] [Revised: 09/13/2018] [Accepted: 09/13/2018] [Indexed: 05/22/2023]
Abstract
In contrast to the well-known benefits for speech intelligibility, the advantage afforded by binaural stimulus presentation for reducing listening effort has not been thoroughly examined. This study investigated spatial release of listening effort and its relation to binaural speech intelligibility in listeners with normal hearing. Psychometric functions for speech intelligibility of a frontal target talker masked by a stationary speech-shaped noise were estimated for several different noise azimuths, different degrees of reverberation, and by maintaining only interaural level or time differences. For each of these conditions, listening effort was measured using a categorical scaling procedure. The results revealed that listening effort was significantly reduced when target and masker were spatially separated in anechoic conditions. This effect extended well into the range of signal-to-noise ratios (SNRs) in which speech intelligibility was at ceiling, and disappeared only at the highest SNRs. In reverberant conditions, spatial release from listening effort was observed for high, but not low, direct-to-reverberant ratios. The findings suggest that listening effort assessment can be a useful method for revealing the benefits of spatial separation of sources under realistic listening conditions comprising favorable SNRs and low reverberation, which typically are not apparent by other means.
Collapse
Affiliation(s)
- Jan Rennies
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
| | - Gerald Kidd
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
| |
Collapse
|
44
|
Factors Affecting Speech Reception in Background Noise with a Vocoder Implementation of the FAST Algorithm. J Assoc Res Otolaryngol 2018; 19:467-478. [PMID: 29744731 DOI: 10.1007/s10162-018-0672-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2017] [Accepted: 04/23/2018] [Indexed: 10/16/2022] Open
Abstract
Speech segregation in background noise remains a difficult task for individuals with hearing loss. Several signal processing strategies have been developed to improve the efficacy of hearing assistive technologies in complex listening environments. The present study measured speech reception thresholds in normal-hearing listeners attending to a vocoder based on the Fundamental Asynchronous Stimulus Timing algorithm (FAST: Smith et al. 2014), which triggers pulses based on the amplitudes of channel magnitudes in order to preserve envelope timing cues, with two different reconstruction bandwidths (narrowband and broadband) to control the degree of spectrotemporal resolution. Five types of background noise were used including same male talker, female talker, time-reversed male talker, time-reversed female talker, and speech-shaped noise to probe the contributions of different types of speech segregation cues and to elucidate how degradation affects speech reception across these conditions. Maskers were spatialized using head-related transfer functions in order to create co-located and spatially separated conditions. Results indicate that benefits arising from voicing and spatial cues can be preserved using the FAST algorithm but are reduced with a reduction in spectral resolution.
Collapse
|
45
|
Davis TJ, Gifford RH. Spatial Release From Masking in Adults With Bilateral Cochlear Implants: Effects of Distracter Azimuth and Microphone Location. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2018; 61:752-761. [PMID: 29450488 PMCID: PMC5963045 DOI: 10.1044/2017_jslhr-h-16-0441] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/29/2016] [Revised: 08/20/2017] [Accepted: 10/04/2017] [Indexed: 06/01/2023]
Abstract
PURPOSE The primary purpose of this study was to derive spatial release from masking (SRM) performance-azimuth functions for bilateral cochlear implant (CI) users to provide a thorough description of SRM as a function of target/distracter spatial configuration. The secondary purpose of this study was to investigate the effect of microphone location on SRM in a within-subject study design. METHOD Speech recognition was measured in 12 adults with bilateral CIs for 11 spatial separations ranging from -90° to +90° in 20° steps using an adaptive block design. Five of the 12 participants were tested with both the behind-the-ear microphones and a T-mic configuration to further investigate the effect of mic location on SRM. RESULTS SRM can be significantly affected by the hemifield origin of the distracter stimulus, particularly for listeners with interaural asymmetry in speech understanding. The greatest SRM was observed with a distracter positioned 50° away from the target. There was no effect of mic location on SRM for the current experimental design. CONCLUSION Our results demonstrate that the traditional assessment of SRM with a distracter positioned at 90° azimuth may underestimate maximum performance for individuals with bilateral CIs.
Collapse
Affiliation(s)
- Timothy J. Davis
- Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN
| | - René H. Gifford
- Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN
| |
Collapse
|
46
|
Corbin NE, Buss E, Leibold LJ. Spatial Release From Masking in Children: Effects of Simulated Unilateral Hearing Loss. Ear Hear 2018; 38:223-235. [PMID: 27787392 PMCID: PMC5321780 DOI: 10.1097/aud.0000000000000376] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES The purpose of this study was twofold: (1) to determine the effect of an acute simulated unilateral hearing loss on children's spatial release from masking in two-talker speech and speech-shaped noise, and (2) to develop a procedure to be used in future studies that will assess spatial release from masking in children who have permanent unilateral hearing loss. There were three main predictions. First, spatial release from masking was expected to be larger in two-talker speech than in speech-shaped noise. Second, simulated unilateral hearing loss was expected to worsen performance in all listening conditions, but particularly in the spatially separated two-talker speech masker. Third, spatial release from masking was expected to be smaller for children than for adults in the two-talker masker. DESIGN Participants were 12 children (8.7 to 10.9 years) and 11 adults (18.5 to 30.4 years) with normal bilateral hearing. Thresholds for 50%-correct recognition of Bamford-Kowal-Bench sentences were measured adaptively in continuous two-talker speech or speech-shaped noise. Target sentences were always presented from a loudspeaker at 0° azimuth. The masker stimulus was either co-located with the target or spatially separated to +90° or -90° azimuth. Spatial release from masking was quantified as the difference between thresholds obtained when the target and masker were co-located and thresholds obtained when the masker was presented from +90° or -90° azimuth. Testing was completed both with and without a moderate simulated unilateral hearing loss, created with a foam earplug and supra-aural earmuff. A repeated-measures design was used to compare performance between children and adults, and performance in the no-plug and simulated-unilateral-hearing-loss conditions. RESULTS All listeners benefited from spatial separation of target and masker stimuli on the azimuth plane in the no-plug listening conditions; this benefit was larger in two-talker speech than in speech-shaped noise. In the simulated-unilateral-hearing-loss conditions, a positive spatial release from masking was observed only when the masker was presented ipsilateral to the simulated unilateral hearing loss. In the speech-shaped noise masker, spatial release from masking in the no-plug condition was similar to that obtained when the masker was presented ipsilateral to the simulated unilateral hearing loss. In contrast, in the two-talker speech masker, spatial release from masking in the no-plug condition was much larger than that obtained when the masker was presented ipsilateral to the simulated unilateral hearing loss. When either masker was presented contralateral to the simulated unilateral hearing loss, spatial release from masking was negative. This pattern of results was observed for both children and adults, although children performed more poorly overall. CONCLUSIONS Children and adults with normal bilateral hearing experience greater spatial release from masking for a two-talker speech than a speech-shaped noise masker. Testing in a two-talker speech masker revealed listening difficulties in the presence of disrupted binaural input that were not observed in a speech-shaped noise masker. This procedure offers promise for the assessment of spatial release from masking in children with permanent unilateral hearing loss.
Collapse
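As a reading aid, a minimal Python sketch of the SRM quantification described in the abstract follows: the spatially separated threshold is subtracted from the co-located threshold, in dB. The threshold values in the usage example are hypothetical, not data from the study.

```python
# Minimal sketch (not from the study) of how spatial release from masking
# (SRM) is quantified in the abstract above: the spatially separated threshold
# is subtracted from the co-located threshold, in dB.

def spatial_release_from_masking(srt_colocated_db: float,
                                 srt_separated_db: float) -> float:
    """Return SRM in dB; positive = benefit of separation, negative = separation hurt."""
    return srt_colocated_db - srt_separated_db

# Hypothetical example: an SRT of -2 dB with a co-located two-talker masker
# and -9 dB with the masker at +90 degrees azimuth gives 7 dB of release.
print(spatial_release_from_masking(-2.0, -9.0))  # 7.0
```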
Affiliation(s)
- Nicole E. Corbin
- Department of Allied Health Sciences, Division of Speech and Hearing Sciences, University of North Carolina at Chapel Hill, School of Medicine, Chapel Hill, NC, USA
| | - Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, School of Medicine, Chapel Hill, NC, USA
| | | |
Collapse
|
47
|
Jakien KM, Kampel SD, Gordon SY, Gallun FJ. The Benefits of Increased Sensation Level and Bandwidth for Spatial Release From Masking. Ear Hear 2018; 38:e13-e21. [PMID: 27556520 PMCID: PMC5161636 DOI: 10.1097/aud.0000000000000352] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2014] [Accepted: 06/03/2016] [Indexed: 11/26/2022]
Abstract
OBJECTIVE Spatial release from masking (SRM) can increase speech intelligibility in complex listening environments. The goal of the present study was to document how speech-in-speech stimuli could be best processed to encourage optimum SRM for listeners who represent a range of ages and amounts of hearing loss. We examined the effects of equating stimulus audibility among listeners, presenting stimuli at uniform sensation levels (SLs), and filtering stimuli at two separate bandwidths. DESIGN Seventy-one participants completed two speech intelligibility experiments (36 listeners in experiment 1; all 71 in experiment 2) in which a target phrase from the coordinate response measure (CRM) and two masking phrases from the CRM were presented simultaneously via earphones using a virtual spatial array, such that the target sentence was always at 0 degrees azimuth and the maskers were either co-located or positioned at ±45 degrees. Experiments 1 and 2 examined the impacts of SL, age, and hearing loss on SRM. Experiment 2 also assessed the effects of stimulus bandwidth on SRM. RESULTS Overall, listeners' ability to achieve SRM improved with increased SL. Younger listeners with less hearing loss achieved more SRM than older or hearing-impaired listeners. It was hypothesized that SL and bandwidth would result in dissociable effects on SRM. However, acoustical analysis revealed that effective audible bandwidth, defined as the highest frequency at which the stimulus was audible at both ears, was the best predictor of performance. Thus, increasing SL seemed to improve SRM by increasing the effective bandwidth rather than by increasing the level of already audible components. CONCLUSIONS Performance for all listeners, regardless of age or hearing loss, improved with an increase in overall SL and/or bandwidth, but the improvement was small relative to the benefits of spatial separation.
Collapse
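The study's key predictor, effective audible bandwidth, can be illustrated with a short sketch that follows the abstract's definition: the highest frequency at which the stimulus exceeds threshold in both ears. The frequencies, stimulus levels, and thresholds used below are invented for illustration and are not the study's data.

```python
# Minimal sketch of the "effective audible bandwidth" idea described above:
# the highest frequency at which the stimulus exceeds threshold in BOTH ears.
# All frequencies, levels, and thresholds below are invented for illustration.
import numpy as np

def effective_audible_bandwidth(freqs_hz, stim_db, thresh_left_db, thresh_right_db):
    """Return the highest frequency (Hz) audible in both ears, or 0.0 if none."""
    audible = (stim_db > thresh_left_db) & (stim_db > thresh_right_db)
    return float(freqs_hz[audible].max()) if np.any(audible) else 0.0

freqs = np.array([250, 500, 1000, 2000, 4000, 8000], dtype=float)
stimulus = np.array([55, 55, 55, 50, 45, 35], dtype=float)   # stimulus level per band (dB)
left = np.array([15, 15, 20, 30, 50, 60], dtype=float)        # left-ear thresholds (dB)
right = np.array([10, 15, 20, 35, 40, 65], dtype=float)       # right-ear thresholds (dB)

print(effective_audible_bandwidth(freqs, stimulus, left, right))  # 2000.0
```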
Affiliation(s)
- Kasey M. Jakien
- Otolaryngology/Head & Neck Surgery, Oregon Health & Science University, Portland, Oregon, USA; and Department of Veterans Affairs, Portland VA Medical Center, National Center for Rehabilitative Auditory Research, Portland, Oregon, USA
| | - Sean D. Kampel
- Otolaryngology/Head & Neck Surgery, Oregon Health & Science University, Portland, Oregon, USA; and Department of Veterans Affairs, Portland VA Medical Center, National Center for Rehabilitative Auditory Research, Portland, Oregon, USA
| | - Samuel Y. Gordon
- Otolaryngology/Head & Neck Surgery, Oregon Health & Science University, Portland, Oregon, USA; and Department of Veterans Affairs, Portland VA Medical Center, National Center for Rehabilitative Auditory Research, Portland, Oregon, USA
| | - Frederick J. Gallun
- Otolaryngology/Head & Neck Surgery, Oregon Health & Science University, Portland, Oregon, USA; and Department of Veterans Affairs, Portland VA Medical Center, National Center for Rehabilitative Auditory Research, Portland, Oregon, USA
| |
Collapse
|
48
|
Having Two Ears Facilitates the Perceptual Separation of Concurrent Talkers for Bilateral and Single-Sided Deaf Cochlear Implantees. Ear Hear 2018; 37:289-302. [PMID: 26886027 DOI: 10.1097/aud.0000000000000284] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES Listening to speech with multiple competing talkers requires the perceptual separation of the target voice from the interfering background. Normal-hearing listeners are able to take advantage of perceived differences in the spatial locations of competing sound sources to facilitate this process. Previous research suggests that bilateral (BI) cochlear-implant (CI) listeners cannot do so, and it is unknown whether single-sided deaf (SSD) CI users (one acoustic and one CI ear) have this ability. This study investigated whether providing a second ear via cochlear implantation can facilitate the perceptual separation of targets and interferers in a listening situation involving multiple competing talkers. DESIGN BI-CI and SSD-CI listeners were required to identify speech from a target talker mixed with one or two interfering talkers. In the baseline monaural condition, the target speech and the interferers were presented to one of the CIs (for the BI-CI listeners) or to the acoustic ear (for the SSD-CI listeners). In the bilateral condition, the target was still presented to the first ear but the interferers were presented to both the target ear and the listener's second ear (always a CI), thereby testing whether CI listeners could use information about the interferer obtained from a second ear to facilitate perceptual separation of the target and interferer. RESULTS Presenting a copy of the interfering signals to the second ear improved performance by up to 4 to 5 dB (12 to 18 percentage points), but the amount of improvement depended on the type of interferer. For BI-CI listeners, the improvement occurred mainly in conditions involving one interfering talker, regardless of gender. For SSD-CI listeners, the improvement occurred in conditions involving one or two interfering talkers of the same gender as the target. This interaction is consistent with the idea that the SSD-CI listeners had access to pitch cues in their normal-hearing ear to separate the opposite-gender target and interferers, while the BI-CI listeners did not. CONCLUSIONS These results suggest that a second auditory input via a CI can facilitate the perceptual separation of competing talkers in situations where monaural cues are insufficient to do so, thus partially restoring a key advantage of having two ears that was previously thought to be inaccessible to CI users.
Collapse
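The two presentation conditions described above amount to a simple stimulus-routing rule, sketched below with synthetic stand-in signals; this is an illustration of the paradigm, not the study's actual stimulus generation.

```python
# Minimal sketch (synthetic signals, not the study's stimuli) of the two
# presentation conditions described above: monaural baseline = target plus
# interferers in one ear only; bilateral = the same ear signal plus a copy of
# the interferers (without the target) routed to the second ear.
import numpy as np

def make_conditions(target, interferers):
    """Return (monaural, bilateral) stereo arrays of shape (n_samples, 2)."""
    masker = np.sum(interferers, axis=0)
    monaural = np.stack([target + masker, np.zeros_like(target)], axis=1)
    bilateral = np.stack([target + masker, masker], axis=1)
    return monaural, bilateral

fs = 16000
t = np.arange(fs) / fs
target = 0.1 * np.sin(2 * np.pi * 440 * t)        # stand-in for the target talker
interferers = 0.05 * np.random.randn(2, fs)       # stand-ins for two interfering talkers
monaural, bilateral = make_conditions(target, interferers)
```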
|
49
|
Mi J, Groll M, Colburn HS. Comparison of a target-equalization-cancellation approach and a localization approach to source separation. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 142:2933. [PMID: 29195469 PMCID: PMC5685812 DOI: 10.1121/1.5009763] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/11/2017] [Revised: 10/19/2017] [Accepted: 10/19/2017] [Indexed: 06/07/2023]
Abstract
Interaural differences are important for listeners to be able to maintain focus on a sound source of interest in the presence of multiple sources. Because interaural differences are sound localization cues, most binaural-cue-based source separation algorithms attempt separation by localizing each time-frequency (T-F) unit to one of the possible source directions using interaural differences. By assembling T-F units that are assigned to one direction, the sound stream from that direction is enhanced. In this paper, a different type of binaural cue for source-separation purposes is proposed. For each T-F unit, the equalization-cancellation (EC) operation is applied to cancel the signal from the target direction; the dominance of the target in each T-F unit is then determined by the effectiveness of the cancellation. Specifically, the energy change from cancellation is used as the criterion for target dominance for each T-F unit. Source-separation performance using the target-EC cue is compared with performance using localization cues. With simulated multi-talker and diffuse-babble interferers, the algorithm based on target-EC cues yields better source-separation performance than the algorithm based on localization cues, both in direct comparison with the ideal binary mask and in measured speech intelligibility for the separated target streams.
Collapse
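The target-EC cue lends itself to a simplified sketch. It assumes the target is at 0° azimuth (so its left- and right-ear components match after equalization) and uses a plain STFT; the dominance threshold and STFT settings are illustrative choices rather than the paper's parameters.

```python
# Simplified sketch of the target-EC cue, assuming a target at 0 degrees
# azimuth. Per time-frequency (T-F) unit, the target direction is cancelled by
# subtracting the ear signals, and the fraction of energy removed serves as
# the target-dominance cue used to build a binary mask.
import numpy as np
from scipy.signal import stft, istft

def target_ec_mask(left, right, fs, dominance_threshold=0.8, nperseg=512):
    """Binary mask keeping T-F units where cancelling the target removes most of the energy."""
    _, _, L = stft(left, fs=fs, nperseg=nperseg)
    _, _, R = stft(right, fs=fs, nperseg=nperseg)
    mixture_energy = np.abs(L) ** 2 + np.abs(R) ** 2 + 1e-12
    residual_energy = np.abs(L - R) ** 2          # energy remaining after cancelling a 0-degree target
    energy_removed = 1.0 - residual_energy / mixture_energy
    return (energy_removed > dominance_threshold).astype(float)

def enhance_target(left, mask, fs, nperseg=512):
    """Apply the binary mask to one ear signal and resynthesize the target stream."""
    _, _, L = stft(left, fs=fs, nperseg=nperseg)
    _, enhanced = istft(L * mask, fs=fs, nperseg=nperseg)
    return enhanced
```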
Affiliation(s)
- Jing Mi
- Hearing Research Center, Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, USA
| | - Matti Groll
- Hearing Research Center, Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, USA
| | - H Steven Colburn
- Hearing Research Center, Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, USA
| |
Collapse
|
50
|
Kidd G. Enhancing Auditory Selective Attention Using a Visually Guided Hearing Aid. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2017; 60:3027-3038. [PMID: 29049603 PMCID: PMC5945072 DOI: 10.1044/2017_jslhr-h-17-0071] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/22/2017] [Revised: 07/28/2017] [Accepted: 07/31/2017] [Indexed: 05/27/2023]
Abstract
PURPOSE Listeners with hearing loss, as well as many listeners with clinically normal hearing, often experience great difficulty segregating talkers in a multiple-talker sound field and selectively attending to the desired "target" talker while ignoring the speech from unwanted "masker" talkers and other sources of sound. This listening situation forms the classic "cocktail party problem" described by Cherry (1953) that has received a great deal of study over the past few decades. In this article, a new approach to improving sound source segregation and enhancing auditory selective attention is described. The conceptual design, current implementation, and results obtained to date are reviewed and discussed. METHOD This approach, embodied in a prototype "visually guided hearing aid" (VGHA) currently used for research, employs acoustic beamforming steered by eye gaze as a means for improving the ability of listeners to segregate and attend to one sound source in the presence of competing sound sources. RESULTS The results from several studies demonstrate that listeners with normal hearing are able to use an attention-based "spatial filter" operating primarily on binaural cues to selectively attend to one source among competing spatially distributed sources. Furthermore, listeners with sensorineural hearing loss generally are less able to use this spatial filter than are listeners with normal hearing, especially in conditions high in "informational masking." The VGHA enhances auditory spatial attention for speech-on-speech masking and improves signal-to-noise ratio for conditions high in "energetic masking." Visual steering of the beamformer supports the coordinated actions of vision and audition in selective attention and facilitates following sound source transitions in complex listening situations. CONCLUSIONS Both listeners with normal hearing and listeners with sensorineural hearing loss may benefit from the acoustic beamforming implemented by the VGHA, especially for nearby sources in less reverberant sound fields. Moreover, guiding the beam using eye gaze can be an effective means of sound source enhancement for listening conditions where the target source changes frequently over time as often occurs during turn-taking in a conversation. PRESENTATION VIDEO http://cred.pubs.asha.org/article.aspx?articleid=2601621.
Collapse
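To make the beamforming idea concrete, a minimal sketch of a gaze-steered delay-and-sum beamformer follows. The array geometry, sample rate, and gaze value are assumptions for illustration; this is not the VGHA's actual implementation.

```python
# Minimal sketch (not the VGHA implementation) of a delay-and-sum beamformer
# over a small linear microphone array whose look direction is set from an
# eye-gaze angle supplied externally (e.g., by an eye tracker).
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def delay_and_sum(mic_signals, mic_positions_m, steer_angle_deg, fs):
    """Time-align each microphone toward steer_angle_deg (0 = broadside) and average."""
    angle = np.deg2rad(steer_angle_deg)
    delays_s = mic_positions_m * np.sin(angle) / SPEED_OF_SOUND
    n = mic_signals.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    out = np.zeros(n)
    for sig, delay in zip(mic_signals, delays_s):
        # Apply the steering delay as a linear phase shift in the frequency domain.
        spectrum = np.fft.rfft(sig) * np.exp(-2j * np.pi * freqs * delay)
        out += np.fft.irfft(spectrum, n)
    return out / len(mic_signals)

fs = 16000
mics = np.random.randn(4, fs)                     # stand-in recordings from four microphones
positions = np.array([0.0, 0.02, 0.04, 0.06])     # 2 cm spacing along the array (m)
gaze_angle_deg = 20.0                             # hypothetical current gaze angle
enhanced = delay_and_sum(mics, positions, gaze_angle_deg, fs)
```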
Affiliation(s)
- Gerald Kidd
- Department of Speech, Language, and Hearing Sciences and Hearing Research Center, Boston University, MA
| |
Collapse
|