1
Smith TM, Shen Y, Williams CN, Kidd GR, McAuley JD. Contribution of speech rhythm to understanding speech in noisy conditions: Further test of a selective entrainment hypothesis. Atten Percept Psychophys 2024;86:627-642. PMID: 38012475. DOI: 10.3758/s13414-023-02815-0.
Abstract
Previous work by McAuley et al. (Attention, Perception, & Psychophysics, 82, 3222-3233, 2020; 83, 2229-2240, 2021) showed that disruption of the natural rhythm of target (attended) speech worsens speech recognition in the presence of competing background speech or noise (a target-rhythm effect), while disruption of background speech rhythm improves target recognition (a background-rhythm effect). While these results were interpreted as support for the role of rhythmic regularities in facilitating target-speech recognition amidst competing backgrounds (in line with a selective entrainment hypothesis), questions remain about the factors that contribute to the target-rhythm effect. Experiment 1 ruled out the possibility that the target-rhythm effect relies on a decrease in intelligibility of the rhythm-altered keywords. Sentences from the Coordinate Response Measure (CRM) paradigm were presented with a background of speech-shaped noise, and the rhythm of the initial portion of these target sentences (the target rhythmic context) was altered while critically leaving the target Color and Number keywords intact. Results showed a target-rhythm effect, evidenced by poorer keyword recognition when the target rhythmic context was altered, despite the absence of rhythmic manipulation of the keywords. Experiment 2 examined the influence of the relative onset asynchrony between target and background keywords. Results showed a significant target-rhythm effect that was independent of the effect of target-background keyword onset asynchrony. Experiment 3 provided additional support for the selective entrainment hypothesis by replicating the target-rhythm effect with a set of speech materials that were less rhythmically constrained than the CRM sentences.
Affiliation(s)
- Toni M Smith, Department of Psychology, Michigan State University, East Lansing, MI, USA
- Yi Shen, Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, USA
- Christina N Williams, Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, USA
- Gary R Kidd, Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- J Devin McAuley, Department of Psychology, Michigan State University, East Lansing, MI, USA
2
Oh Y, Friggle P, Kinder J, Tilbrook G, Bridges SE. Effects of presentation level on speech-on-speech masking by voice-gender difference and spatial separation between talkers. Front Neurosci 2023;17:1282764. PMID: 38192513. PMCID: PMC10773857. DOI: 10.3389/fnins.2023.1282764.
Abstract
Many previous studies have reported that speech segregation performance in multi-talker environments can be enhanced by two major acoustic cues: (1) voice-characteristic differences between talkers; (2) spatial separation between talkers. Here, the improvement they provide for speech segregation is referred to as "release from masking." The goal of this study was to investigate how masking release performance with these two cues is affected by the target presentation level. Sixteen normal-hearing listeners participated in a speech-recognition-in-noise experiment. Speech-on-speech masking performance was measured as the threshold target-to-masker ratio needed to understand a target talker in the presence of either same- or different-gender masker talkers, to manipulate the voice-gender difference cue. These target-masker gender combinations were tested with five spatial configurations (maskers co-located with, or 15°, 30°, 45°, and 60° symmetrically spatially separated from, the target) to manipulate the spatial separation cue. In addition, these conditions were repeated at three target presentation levels (30, 40, and 50 dB sensation level). Results revealed that the amount of masking release provided by either the voice-gender difference or the spatial separation cue was significantly affected by the target level, especially at the smallest target-masker spatial separation (±15°). Further, the results showed that the intersection points between the two masking release types (equal perceptual weighting) varied with the target level. These findings suggest that the perceptual weighting of masking release from the two cues is non-linearly related to the target level. The target presentation level could be one major factor associated with masking release performance in normal-hearing listeners.
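The "release from masking" described here reduces to a simple difference between threshold target-to-masker ratios (TMRs) measured with and without the cue of interest. A minimal sketch, using hypothetical threshold values rather than the study's data:

```python
# Masking release: the drop in threshold target-to-masker ratio (TMR)
# when a segregation cue (voice-gender difference or spatial separation)
# is added, relative to a no-cue reference condition.

def masking_release(tmr_reference_db, tmr_with_cue_db):
    """Release from masking in dB; positive values mean the cue helped."""
    return tmr_reference_db - tmr_with_cue_db

# Hypothetical thresholds (dB TMR) at one target presentation level:
tmr_colocated_same_gender = 2.0    # reference: no gender or spatial cue
tmr_different_gender = -6.0        # voice-gender difference cue only
tmr_separated_15deg = -3.0         # +/-15 degree spatial separation only

gender_release = masking_release(tmr_colocated_same_gender, tmr_different_gender)   # 8.0 dB
spatial_release = masking_release(tmr_colocated_same_gender, tmr_separated_15deg)   # 5.0 dB
```

Repeating this computation at each sensation level (30, 40, and 50 dB SL in the study) would expose the level dependence the abstract describes.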
Affiliation(s)
- Yonghee Oh, Department of Otolaryngology-Head and Neck Surgery and Communicative Disorders, University of Louisville, Louisville, KY, USA; Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville, FL, USA
- Phillip Friggle, Department of Otolaryngology-Head and Neck Surgery and Communicative Disorders, University of Louisville, Louisville, KY, USA
- Josephine Kinder, Department of Otolaryngology-Head and Neck Surgery and Communicative Disorders, University of Louisville, Louisville, KY, USA
- Grace Tilbrook, Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville, FL, USA
- Sarah E. Bridges, Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville, FL, USA
3
Uhrig S, Perkis A, Möller S, Svensson UP, Behne DM. Effects of Spatial Speech Presentation on Listener Response Strategy for Talker-Identification. Front Neurosci 2022;15:730744. PMID: 35153653. PMCID: PMC8831717. DOI: 10.3389/fnins.2021.730744.
Abstract
This study investigates effects of spatial auditory cues on human listeners' response strategy for identifying two alternately active talkers (“turn-taking” listening scenario). Previous research has demonstrated subjective benefits of audio spatialization with regard to speech intelligibility and talker-identification effort. So far, the deliberate activation of specific perceptual and cognitive processes by listeners to optimize their task performance has remained largely unexamined. Spoken sentences selected as stimuli were either clean or degraded due to background noise or bandpass filtering. Stimuli were presented via three horizontally positioned loudspeakers: In a non-spatial mode, both talkers were presented through a central loudspeaker; in a spatial mode, each talker was presented through the central or a talker-specific lateral loudspeaker. Participants identified talkers via speeded keypresses and afterwards provided subjective ratings (speech quality, speech intelligibility, voice similarity, talker-identification effort). In the spatial mode, presentations at lateral loudspeaker locations entailed quicker behavioral responses, which were nonetheless significantly slower than in a comparable talker-localization task. Under clean speech, response times globally increased in the spatial vs. non-spatial mode (across all locations); these “response time switch costs,” presumably caused by repeated switching of spatial auditory attention between different locations, diminished under degraded speech. No significant effects of spatialization on subjective ratings were found. The results suggest that when listeners could utilize task-relevant auditory cues about talker location, they continued to rely on voice recognition rather than localization of talker sound sources as their primary response strategy. In addition, the presence of speech degradations may have led to increased cognitive control, which in turn compensated for the incurred response time switch costs.
Affiliation(s)
- Stefan Uhrig, Department of Electronic Systems, Norwegian University of Science and Technology, Trondheim, Norway; Quality and Usability Lab, Technische Universität Berlin, Berlin, Germany
- Andrew Perkis, Department of Electronic Systems, Norwegian University of Science and Technology, Trondheim, Norway
- Sebastian Möller, Quality and Usability Lab, Technische Universität Berlin, Berlin, Germany; Speech and Language Technology, German Research Center for Artificial Intelligence, Berlin, Germany
- U. Peter Svensson, Department of Electronic Systems, Norwegian University of Science and Technology, Trondheim, Norway
- Dawn M. Behne, Department of Psychology, Norwegian University of Science and Technology, Trondheim, Norway
4
Oh Y, Hartling CL, Srinivasan NK, Diedesch AC, Gallun FJ, Reiss LAJ. Factors underlying masking release by voice-gender differences and spatial separation cues in multi-talker listening environments in listeners with and without hearing loss. Front Neurosci 2022;16:1059639. PMID: 36507363. PMCID: PMC9726925. DOI: 10.3389/fnins.2022.1059639.
Abstract
Voice-gender differences and spatial separation are important cues for auditory object segregation. The goal of this study was to investigate the relationship of voice-gender difference benefit to the breadth of binaural pitch fusion, the perceptual integration of dichotic stimuli that evoke different pitches across ears, and the relationship of spatial separation benefit to localization acuity, the ability to identify the direction of a sound source. Twelve bilateral hearing aid (HA) users (age from 30 to 75 years) and eleven normal hearing (NH) listeners (age from 36 to 67 years) were tested in the following three experiments. First, speech-on-speech masking performance was measured as the threshold target-to-masker ratio (TMR) needed to understand a target talker in the presence of either same- or different-gender masker talkers. These target-masker gender combinations were tested with two spatial configurations (maskers co-located or 60° symmetrically spatially separated from the target) in both monaural and binaural listening conditions. Second, binaural pitch fusion range measurements were conducted using harmonic tone complexes around a 200-Hz fundamental frequency. Third, absolute localization acuity was measured using broadband (125-8000 Hz) noise and one-third octave noise bands centered at 500 and 3000 Hz. Voice-gender differences between target and maskers improved TMR thresholds for both listener groups in the binaural condition as well as both monaural (left ear and right ear) conditions, with greater benefit in co-located than spatially separated conditions. Voice-gender difference benefit was correlated with the breadth of binaural pitch fusion in the binaural condition, but not the monaural conditions, ruling out a role of monaural abilities in the relationship between binaural fusion and voice-gender difference benefits. Spatial separation benefit was not significantly correlated with absolute localization acuity. In addition, greater spatial separation benefit was observed in NH listeners than in bilateral HA users, indicating a decreased ability of HA users to benefit from spatial release from masking (SRM). These findings suggest that sharp binaural pitch fusion may be important for maximal speech perception in multi-talker environments for both NH listeners and bilateral HA users.
Affiliation(s)
- Yonghee Oh, Department of Otolaryngology and Communicative Disorders, University of Louisville, Louisville, KY, USA; National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, OR, USA
- Curtis L. Hartling, Department of Otolaryngology, Oregon Health & Science University, Portland, OR, USA
- Nirmal Kumar Srinivasan, Department of Speech-Language Pathology & Audiology, Towson University, Towson, MD, USA
- Anna C. Diedesch, Department of Communication Sciences and Disorders, Western Washington University, Bellingham, WA, USA
- Frederick J. Gallun, National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, OR, USA; Department of Otolaryngology, Oregon Health & Science University, Portland, OR, USA
- Lina A. J. Reiss, National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, OR, USA; Department of Otolaryngology, Oregon Health & Science University, Portland, OR, USA
5
Müller V, Lang-Roth R. Speech Recognition With Informational and Energetic Maskers in Patients With Single-Sided Deafness After Cochlear Implantation. J Speech Lang Hear Res 2021;64:3343-3356. PMID: 34310192. DOI: 10.1044/2021_jslhr-20-00677.
Abstract
Purpose: The aim of the study was to assess the susceptibility to energetic and informational masking in patients with single-sided deafness (SSD) with one normal-hearing (NH) ear and a cochlear implant (CI) in the contralateral ear, to understand the effect on speech recognition of spatially separating noise and speech maskers, and to investigate the influence of the CI in situations with energetic and informational masking.
Method: Speech recognition was measured in the presence of either a modulated speech-shaped noise or one of two competing speech maskers in 11 SSD-CI listeners. The speech maskers were manipulated with respect to fundamental frequency to consider the effect of different voices. Measurements were conducted in the unaided (NH) and aided (NHCI) conditions. Spatial release from masking (SRM) was calculated for each masker type and both listening conditions (NH and NHCI) by subtracting scores of the colocated target and masker condition (S0N0) from the spatially separated target and masker conditions (S0N≠0).
Results: Speech recognition was highly variable depending on the type of masker. SRM occurred in the unaided (NH) and aided (NHCI) conditions when the speech masker had the same gender as the target talker. Adding the CI improved speech recognition when this speech masker was ipsilateral to the NH ear.
Conclusions: The amount of informational masking is substantial in SSD-CI listeners with both colocated and spatially separated target and masker signals. The contribution of SRM to better speech recognition largely depends on the masker and is considerable when there are no voice differences between the target and the competing talker. There is only a slight improvement in speech recognition from adding the CI.
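The SRM computation described above is a per-masker difference of recognition scores between the separated (S0N≠0) and colocated (S0N0) configurations. A minimal sketch, with hypothetical percent-correct scores rather than the study's data:

```python
# SRM per the definition above: score in the spatially separated
# condition (S0N != 0) minus score in the colocated condition (S0N0).

def srm(score_separated, score_colocated):
    """Spatial release from masking, in percentage points."""
    return score_separated - score_colocated

# Hypothetical percent-correct scores for one listening condition (NHCI):
scores = {
    "noise":              {"S0N0": 55.0, "S0N90": 70.0},
    "same_gender_talker": {"S0N0": 40.0, "S0N90": 68.0},
}
srm_by_masker = {m: srm(s["S0N90"], s["S0N0"]) for m, s in scores.items()}
# positive SRM means separation helped for that masker type
```

Computing this separately per masker type and per listening condition (NH vs. NHCI) reproduces the comparison structure of the study.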
Affiliation(s)
- Verena Müller, Department of Otorhinolaryngology, Head and Neck Surgery, Faculty of Medicine, University of Cologne, Germany
- Ruth Lang-Roth, Department of Otorhinolaryngology, Head and Neck Surgery, Faculty of Medicine, University of Cologne, Germany
6
Oh Y, Bridges SE, Schoenfeld H, Layne AO, Eddins D. Interaction between voice-gender difference and spatial separation in release from masking in multi-talker listening environments. JASA Express Lett 2021;1:084404. PMID: 34713273. PMCID: PMC8547139. DOI: 10.1121/10.0005831.
Abstract
Voice-gender difference and spatial separation between talkers are important cues for speech segregation in multi-talker listening environments. The goal of this study was to investigate the interaction of these two cues and how they influence masking release in normal-hearing listeners. Speech recognition thresholds in competing speech were measured, and the masking release benefits provided by either voice-gender difference or spatial separation cues were calculated. Results revealed that the masking releases from the two cues are inversely related as a function of spatial separation, with a gender-specific difference in the transition point between the two types of masking release.
Affiliation(s)
- Yonghee Oh, Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville, Florida 32610, USA
- Sarah E Bridges, Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville, Florida 32610, USA
- Hannah Schoenfeld, Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville, Florida 32610, USA
- Allison O Layne, Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville, Florida 32610, USA
- David Eddins, Department of Communication Science and Disorders, University of South Florida, Tampa, Florida 33612, USA
7
McAuley JD, Shen Y, Smith T, Kidd GR. Effects of speech-rhythm disruption on selective listening with a single background talker. Atten Percept Psychophys 2021;83:2229-2240. PMID: 33782913. PMCID: PMC10612531. DOI: 10.3758/s13414-021-02298-x.
Abstract
Recent work by McAuley et al. (Attention, Perception, & Psychophysics, 82, 3222-3233, 2020) using the Coordinate Response Measure (CRM) paradigm with a multitalker background revealed that altering the natural rhythm of target speech amidst background speech worsens target recognition (a target-rhythm effect), while altering background speech rhythm improves target recognition (a background-rhythm effect). Here, we used a single-talker background to examine the role of specific properties of target and background sound patterns on selective listening without the complexity of multiple background stimuli. Experiment 1 manipulated the sex of the background talker, presented with a male target talker, to assess target and background-rhythm effects with and without a strong pitch cue to aid perceptual segregation. Experiment 2 used a vocoded single-talker background to examine target and background-rhythm effects with envelope-based speech rhythms preserved, but without semantic content or temporal fine structure. While a target-rhythm effect was present with all backgrounds, the background-rhythm effect was only observed for the same-sex background condition. Results provide additional support for a selective entrainment hypothesis, while also showing that the background-rhythm effect is not driven by envelope-based speech rhythm alone, and may be reduced or eliminated when pitch or other acoustic differences provide a strong basis for selective listening.
Affiliation(s)
- J Devin McAuley, Department of Psychology, Michigan State University, East Lansing, MI 48824, USA
- Yi Shen, Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, USA
- Toni Smith, Department of Psychology, Michigan State University, East Lansing, MI 48824, USA
- Gary R Kidd, Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
8
Wang X, Xu L. Speech perception in noise: Masking and unmasking. J Otol 2021;16:109-119. PMID: 33777124. PMCID: PMC7985001. DOI: 10.1016/j.joto.2020.12.001.
Abstract
Speech perception is essential for daily communication. However, background noise or concurrent talkers can make it challenging for listeners to track the target speech (i.e., the cocktail party problem). The present study reviews and compares existing findings on speech perception and unmasking in cocktail party listening environments in English and Mandarin Chinese. The review starts with an introduction, followed by related concepts of auditory masking. The next two sections review factors that release speech perception from masking in English and Mandarin Chinese, respectively. The last section presents an overall summary of the findings, with comparisons between the two languages. Future research directions regarding the differences between the two languages in the literature on the reviewed topic are also discussed.
Affiliation(s)
- Xianhui Wang, Communication Sciences and Disorders, Ohio University, Athens, OH 45701, USA
- Li Xu, Communication Sciences and Disorders, Ohio University, Athens, OH 45701, USA
9
Adel Ghahraman M, Ashrafi M, Mohammadkhani G, Jalaie S. Effects of aging on spatial hearing. Aging Clin Exp Res 2020;32:733-739. PMID: 31203530. DOI: 10.1007/s40520-019-01233-3.
Abstract
Background: Aging has several effects on auditory processing, the most important being impaired speech perception in noise.
Aims: The aim of the present study was to investigate the effects of aging on spatial hearing using the quick speech-in-noise (QSIN) and binaural masking level difference (BMLD) tests and the Speech, Spatial, and Qualities of Hearing Scale (SSQ) questionnaire.
Methods: The study was carried out on 34 elderly people, aged 60-75 years, with normal peripheral hearing and 34 young participants, aged 18-25 years. Using the SSQ questionnaire and the QSIN and BMLD tests, spatial auditory processing ability was compared between the two groups.
Results: Comparison of mean scores using the independent t test showed a significant difference in the mean scores of the QSIN and BMLD tests and the SSQ questionnaire between the two groups (p < 0.001). Sex had no effect on the results (p > 0.05).
Discussion: Structural and neurochemical changes that occur with aging in different parts of the central nervous system affect various aspects of spatial auditory processing, such as localization, the precedence effect, and speech perception in noise.
Conclusions: The lower scores of older adults with normal hearing on the SSQ questionnaire and the behavioral tests, compared with younger participants, may reflect weak spatial auditory processing. The results of the present study reconfirm the effects of aging on spatial auditory processing, such as localization and speech perception in noise.
Affiliation(s)
- Mansoureh Adel Ghahraman, Department of Audiology, School of Rehabilitation, Tehran University of Medical Sciences, Tehran, Iran
- Majid Ashrafi, Department of Audiology, School of Rehabilitation, Tehran University of Medical Sciences, Tehran, Iran
- Ghassem Mohammadkhani, Department of Audiology, School of Rehabilitation, Tehran University of Medical Sciences, Tehran, Iran
- Shohreh Jalaie, Biostatistics, School of Rehabilitation, Tehran University of Medical Sciences, Tehran, Iran
10
Rouhbakhsh N, Mahdi J, Hwo J, Nobel B, Mousave F. Human Frequency Following Response Correlates of Spatial Release From Masking. J Speech Lang Hear Res 2019;62:4165-4178. PMID: 31644365. DOI: 10.1044/2019_jslhr-h-18-0353.
Abstract
Purpose: Speech recognition in complex listening environments is enhanced by the extent of spatial separation between the speech source and background competing sources, an effect known as spatial release from masking (SRM). The aim of this study was to investigate whether the phase-locked neural activity in the central auditory pathways, reflected in the frequency following response (FFR), exhibits SRM.
Method: Eighteen normal-hearing adults (8 men and 10 women, ranging in age from 20 to 42 years) with no known neurological disorders participated in this study. FFRs were recorded from the participants in response to a target vowel /u/ presented with spatially colocated and separated competing talkers at 3 ranges of signal-to-noise ratios (SNRs), with median SNRs of -5.4, 0.5, and 6.8 dB, and for different attentional conditions (attention and no attention).
Results: Amplitude of the FFR at the fundamental frequency was significantly larger in the spatially separated condition as compared to the colocated condition for only the lowest (< -2.4 dB SNR) of the 3 SNR ranges tested. A significant effect of attention was found when subjects were actively focusing on the target stimuli. No significant interaction effects were found between spatial separation and attention.
Conclusions: The enhanced representation of the target stimulus in the separated condition suggests that the temporal pattern of phase-locked brainstem neural activity generating the FFR may contain information relevant to the binaural processes underlying SRM but only in challenging listening environments. Attention may modulate FFR fundamental frequency amplitude but does not seem to modulate spatial processing at the level of generating the FFR.
Supplemental Material: https://doi.org/10.23641/asha.9992597
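A common way to quantify "amplitude of the FFR at the fundamental frequency" is to project the averaged response onto a sinusoid at F0, i.e., a single-bin discrete Fourier transform. The sketch below uses a synthetic waveform with assumed parameters (sampling rate, F0, amplitudes), not the study's recordings:

```python
import math
import random

def amplitude_at(freq_hz, samples, fs_hz):
    """Single-bin DFT: amplitude of the sinusoidal component at freq_hz."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / fs_hz) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / fs_hz) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

# Synthetic "FFR": 0.2 s at 8 kHz, a 250-Hz fundamental of amplitude 0.5
# buried in Gaussian noise (all values are illustrative assumptions).
random.seed(0)
fs, f0, dur = 8000, 250.0, 0.2
ffr = [0.5 * math.sin(2 * math.pi * f0 * i / fs) + 0.05 * random.gauss(0, 1)
       for i in range(int(fs * dur))]

f0_amp = amplitude_at(f0, ffr, fs)   # close to the true amplitude, 0.5
```

Comparing this F0 amplitude between colocated and separated conditions is what an FFR-based SRM measure amounts to.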
Affiliation(s)
- Nematollah Rouhbakhsh, HEARing Cooperation Research Centre, Melbourne, Victoria, Australia; University of Melbourne, Victoria, Australia; National Acoustic Laboratories, Australian Hearing Hub, Macquarie University, Sydney, New South Wales, Australia; Department of Audiology, School of Rehabilitation, Tehran University of Medical Sciences, Tehran, Iran
- John Mahdi, The New York Academy of Sciences, New York, NY, USA
- Jacob Hwo, Department of Biomedical Science, Faculty of Medicine and Health, The University of Sydney, New South Wales, Australia
- Baran Nobel, Department of Audiology, School of Health and Rehabilitation Sciences, The University of Queensland, St. Lucia, Australia
- Fati Mousave, Department of Audiology, School of Health and Rehabilitation Sciences, The University of Queensland, St. Lucia, Australia
11
Rouhbakhsh N, Mahdi J, Hwo J, Nobel B, Mousave F. Spatial hearing processing: electrophysiological documentation at subcortical and cortical levels. Int J Neurosci 2019;129:1119-1132. DOI: 10.1080/00207454.2019.1635129.
Affiliation(s)
- Nematollah Rouhbakhsh, HEARing Cooperation Research Centre, Melbourne, Australia; Department of Audiology and Speech Pathology, School of Health Sciences, University of Melbourne, Melbourne, Australia; National Acoustic Laboratories, Australian Hearing Hub, Macquarie University, Sydney, Australia; Department of Audiology, School of Rehabilitation, Tehran University of Medical Sciences, Pich-e Shemiran, Tehran, Iran
- John Mahdi, The New York Academy of Sciences, New York, NY, USA
- Jacob Hwo, Department of Biomedical Science, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Baran Nobel, Department of Audiology, School of Health and Rehabilitation Sciences, The University of Queensland, Queensland, Australia
- Fati Mousave, Department of Audiology, School of Health and Rehabilitation Sciences, The University of Queensland, Queensland, Australia
12
Brandewie EJ, Zahorik P. Speech intelligibility in rooms: Disrupting the effect of prior listening exposure. J Acoust Soc Am 2018;143:3068. PMID: 29857737. PMCID: PMC5966308. DOI: 10.1121/1.5038278.
Abstract
It has been demonstrated that prior listening exposure to a reverberant environment can improve speech understanding in that environment. Previous studies have shown that the buildup of this effect is brief (less than 1 s) and seems largely to be elicited by exposure to the temporal modulation characteristics of the room environment. Situations that might be expected to cause a disruption of this process have yet to be demonstrated. This study seeks to address this issue by showing what types of changes in the acoustic environment cause a breakdown of the room exposure phenomenon. Using speech carrier phrases featuring sudden changes in the acoustic environment, breakdown of the room exposure effect was observed when there was a change in the late reverberation characteristics of the room that signaled a different room environment. Changes in patterns of early reflections within the same room environment did not elicit breakdown. Because the environmental situations that resulted in breakdown also resulted in substantial changes to the broadband temporal modulation characteristic of the signal reaching the ears, results from this study provide additional support for the hypothesis that the room exposure phenomenon is linked to the temporal modulation characteristics of the environment.
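The "temporal modulation characteristics" invoked above can be made concrete: extract the signal's amplitude envelope and measure energy at a given modulation frequency. A rough sketch with illustrative parameters (window length, carrier and modulation frequencies are assumptions, not values from the study):

```python
import math

def envelope(samples, win):
    """Rectified, moving-average-smoothed amplitude envelope."""
    rect = [abs(s) for s in samples]
    out, acc = [], 0.0
    for i, r in enumerate(rect):
        acc += r
        if i >= win:
            acc -= rect[i - win]        # slide the window forward
        out.append(acc / min(i + 1, win))
    return out

def modulation_amplitude(env, mod_hz, fs):
    """Single-bin DFT of the mean-removed envelope at mod_hz."""
    n = len(env)
    mean = sum(env) / n
    re = sum((e - mean) * math.cos(2 * math.pi * mod_hz * i / fs) for i, e in enumerate(env))
    im = sum((e - mean) * math.sin(2 * math.pi * mod_hz * i / fs) for i, e in enumerate(env))
    return 2.0 * math.hypot(re, im) / n

# A 4-Hz amplitude-modulated 200-Hz tone shows an envelope peak at 4 Hz:
fs = 2000
sig = [(1.0 + 0.8 * math.sin(2 * math.pi * 4 * i / fs)) * math.sin(2 * math.pi * 200 * i / fs)
       for i in range(fs)]  # 1 s of signal
env = envelope(sig, win=50)
m4 = modulation_amplitude(env, 4.0, fs)    # large: the AM rate
m20 = modulation_amplitude(env, 20.0, fs)  # small: no 20-Hz modulation
```

Reverberation smears such envelope modulations, so comparing modulation spectra before and after an environment change is one way to see the cue the authors describe.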
Affiliation(s)
- Eugene J Brandewie, Center for Applied and Translational Sensory Science, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Pavel Zahorik, Department of Otolaryngology and Communicative Disorders, University of Louisville, Louisville, Kentucky 40292, USA
13
Davis TJ, Gifford RH. Spatial Release From Masking in Adults With Bilateral Cochlear Implants: Effects of Distracter Azimuth and Microphone Location. J Speech Lang Hear Res 2018;61:752-761. PMID: 29450488. PMCID: PMC5963045. DOI: 10.1044/2017_jslhr-h-16-0441.
Abstract
Purpose: The primary purpose of this study was to derive spatial release from masking (SRM) performance-azimuth functions for bilateral cochlear implant (CI) users, providing a thorough description of SRM as a function of target/distracter spatial configuration. The secondary purpose was to investigate the effect of microphone location on SRM in a within-subject design.
Method: Speech recognition was measured in 12 adults with bilateral CIs for 11 spatial separations ranging from -90° to +90° in 20° steps using an adaptive block design. Five of the 12 participants were tested with both behind-the-ear microphones and a T-mic configuration to further investigate the effect of mic location on SRM.
Results: SRM can be significantly affected by the hemifield origin of the distracter stimulus, particularly for listeners with interaural asymmetry in speech understanding. The greatest SRM was observed with a distracter positioned 50° away from the target. There was no effect of mic location on SRM for the current experimental design.
Conclusion: Our results demonstrate that the traditional assessment of SRM with a distracter positioned at 90° azimuth may underestimate maximum performance for individuals with bilateral CIs.
Affiliation(s)
- Timothy J. Davis
- Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN
- René H. Gifford
- Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN
14
Pastore MT, Yost WA. Spatial release from masking with a moving target. Front Psychol 2017; 8:2238. [PMID: 29326638] [PMCID: PMC5742351] [DOI: 10.3389/fpsyg.2017.02238]
Abstract
In the visual domain, a stationary object that is difficult to detect usually becomes far more salient if it moves while the objects around it do not. This “pop out” effect is important for parsing the visual world into figure/ground relationships that allow creatures to detect food, threats, etc. We tested for an auditory correlate to this visual effect by asking listeners to identify a single word, spoken by a female, embedded among two or four masking words spoken by males. Percent-correct scores were analyzed and compared between conditions in which target and maskers were presented from the same position vs. when the target was presented from one position while maskers were presented from different positions. In some trials, the target word was moved across the speaker array using amplitude panning, while in other trials the target was played from a single, static position. Results showed a spatial release from masking for all conditions in which the target and maskers were not located at the same position, but there was no statistically significant difference in identification performance between moving and stationary targets. These results suggest that, at least for short stimulus durations (0.75 s for the stimuli in this experiment), there is unlikely to be a “pop out” effect for moving target stimuli in the auditory domain as there is in the visual domain.
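The amplitude panning mentioned in this abstract can be sketched generically; the snippet below uses a constant-power pan law swept linearly across a signal to simulate motion. This is an illustrative assumption, not the authors' actual implementation (their array geometry and panning law are not given in the abstract).

```python
import math

def constant_power_pan(sample: float, pan: float) -> tuple[float, float]:
    """Pan one sample between two speakers; pan in [0, 1], 0 = first speaker."""
    theta = pan * math.pi / 2
    return sample * math.cos(theta), sample * math.sin(theta)

def pan_moving_source(signal: list[float], start: float, end: float):
    """Sweep the pan position linearly over the signal, simulating a moving source."""
    n = max(len(signal) - 1, 1)
    pairs = [constant_power_pan(s, start + (end - start) * i / n)
             for i, s in enumerate(signal)]
    return [l for l, _ in pairs], [r for _, r in pairs]
```

Constant-power panning keeps the summed power of the two channels fixed while the image moves, which avoids loudness changes confounding the motion cue.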
Affiliation(s)
- M Torben Pastore
- Department of Speech and Hearing Science, Arizona State University, Tempe, AZ, United States
- William A Yost
- Department of Speech and Hearing Science, Arizona State University, Tempe, AZ, United States
15
Davis TJ, Grantham DW, Gifford RH. Effect of motion on speech recognition. Hear Res 2016; 337:80-88. [PMID: 27240478] [DOI: 10.1016/j.heares.2016.05.011]
Abstract
The benefit of spatial separation for talkers in a multi-talker environment is well documented. However, few studies have examined the effect of talker motion on speech recognition. In the current study, we evaluated the effects of (1) motion of the target or distracters, (2) a priori information about the target and distracter spatial configurations, and (3) target and distracter location. In total, seventeen young adults with normal hearing were tested in a large anechoic chamber in two experiments. In Experiment 1, seven stimulus conditions were tested using the Coordinate Response Measure (Bolia et al., 2000) speech corpus, in which subjects were required to report the key words in a target sentence presented simultaneously with two distracter sentences. As in previous studies, there was a significant improvement in key word identification for conditions in which the target and distracters were spatially separated as compared to the co-located conditions. In addition, (1) motion of either the target or the distracters improved performance compared to stationary presentation, with target motion yielding significantly better performance than distracter motion; (2) a priori information regarding stimulus configuration was not beneficial; and (3) performance was significantly better with key words at 0° azimuth as compared to -60° (on the listener's left). Experiment 2 included two additional conditions designed to assess whether the benefit of motion observed in Experiment 1 was due to the motion itself or to the fact that the motion conditions introduced small spatial separations in the target and distracter key words. Results showed that small spatial separations (on the order of 5-8°) resulted in improved performance (relative to co-located key words) whether the sentences were moving or stationary. These results suggest that in the presence of distracting messages, motion of either target or distracters and/or small spatial separation of the key words may be beneficial for sound source segregation and thus for improved speech recognition.
Affiliation(s)
- Timothy J Davis
- Vanderbilt University, Department of Hearing and Speech Sciences, Nashville, TN, USA.
- D Wesley Grantham
- Vanderbilt University, Department of Hearing and Speech Sciences, Nashville, TN, USA
- René H Gifford
- Vanderbilt University, Department of Hearing and Speech Sciences, Nashville, TN, USA
16
Lin G, Carlile S. Costs of switching auditory spatial attention in following conversational turn-taking. Front Neurosci 2015; 9:124. [PMID: 25941466] [PMCID: PMC4403343] [DOI: 10.3389/fnins.2015.00124]
Abstract
Following a multi-talker conversation relies on the ability to rapidly and efficiently shift the focus of spatial attention from one talker to another. The current study investigated the listening costs associated with shifts in spatial attention during conversational turn-taking in 16 normally-hearing listeners using a novel sentence recall task. Three pairs of syntactically fixed but semantically unpredictable matrix sentences, recorded from a single male talker, were presented concurrently through an array of three loudspeakers (directly ahead and ±30° azimuth). Subjects attended to one spatial location, cued by a tone, and followed the target conversation from one sentence to the next using the call-sign at the beginning of each sentence. Subjects were required to report the last three words of each sentence (speech recall task) or answer multiple-choice questions related to the target material (speech comprehension task). The reading span test, attention network test, and trail making test were also administered to assess working memory, attentional control, and executive function. There was a 10.7 ± 1.3% decrease in word recall, a pronounced primacy effect, and a rise in masker confusion errors and word omissions when the target switched location between sentences. Switching costs were independent of the location, direction, and angular size of the spatial shift but did appear to be load-dependent and only significant for complex questions requiring multiple cognitive operations. Reading span scores were positively correlated with total words recalled, and negatively correlated with switching costs and word omissions. Task switching speed (Trail-B time) was also significantly correlated with recall accuracy. Overall, this study highlights (i) the listening costs associated with shifts in spatial attention and (ii) the important role of working memory in maintaining goal-relevant information and extracting meaning from dynamic multi-talker conversations.
Affiliation(s)
- Gaven Lin
- Auditory Neuroscience Laboratory, Department of Physiology, School of Medical Sciences, University of Sydney, Sydney, NSW, Australia
- Simon Carlile
- Auditory Neuroscience Laboratory, Department of Physiology, School of Medical Sciences, University of Sydney, Sydney, NSW, Australia
17
Xia J, Nooraei N, Kalluri S, Edwards B. Spatial release of cognitive load measured in a dual-task paradigm in normal-hearing and hearing-impaired listeners. J Acoust Soc Am 2015; 137:1888-1898. [PMID: 25920841] [DOI: 10.1121/1.4916599]
Abstract
This study investigated whether spatial separation between talkers helps reduce cognitive processing load, and how hearing impairment interacts with the cognitive load of individuals listening in multi-talker environments. A dual-task paradigm was used in which performance on a secondary task (visual tracking) served as a measure of the cognitive load imposed by a speech recognition task. Visual tracking performance was measured under four conditions in which the target and the interferers were distinguished by (1) gender and spatial location, (2) gender only, (3) spatial location only, and (4) neither gender nor spatial location. Results showed that when gender cues were available, a 15° spatial separation between talkers reduced the cognitive load of listening even though it did not provide further improvement in speech recognition (Experiment I). Compared to normal-hearing listeners, large individual variability in spatial release of cognitive load was observed among hearing-impaired listeners. Cognitive load was lower when talkers were spatially separated by 60° than when talkers were of different genders, even though speech recognition was comparable in these two conditions (Experiment II). These results suggest that a measure of cognitive load might provide valuable insight into the benefit of spatial cues in multi-talker environments.
Affiliation(s)
- Jing Xia
- Starkey Hearing Research Center, 2150 Shattuck Avenue, Suite 408, Berkeley, California 94704
- Nazanin Nooraei
- Starkey Hearing Research Center, 2150 Shattuck Avenue, Suite 408, Berkeley, California 94704
- Sridhar Kalluri
- Starkey Hearing Research Center, 2150 Shattuck Avenue, Suite 408, Berkeley, California 94704
- Brent Edwards
- Starkey Hearing Research Center, 2150 Shattuck Avenue, Suite 408, Berkeley, California 94704
18
Getzmann S, Lewald J, Falkenstein M. Using auditory pre-information to solve the cocktail-party problem: electrophysiological evidence for age-specific differences. Front Neurosci 2014; 8:413. [PMID: 25540608] [PMCID: PMC4261705] [DOI: 10.3389/fnins.2014.00413]
Abstract
Speech understanding in complex and dynamic listening environments requires (a) auditory scene analysis, namely auditory object formation and segregation, and (b) allocation of the attentional focus to the talker of interest. There is evidence that pre-information is actively used to facilitate these two aspects of the so-called “cocktail-party” problem. Here, a simulated multi-talker scenario was combined with electroencephalography to study scene analysis and allocation of attention in young and middle-aged adults. Sequences of short words (combinations of brief company names and stock-price values) from four talkers at different locations were simultaneously presented, and the detection of target names and the discrimination between critical target values were assessed. Immediately prior to speech sequences, auditory pre-information was provided via cues that either prepared auditory scene analysis or attentional focusing, or non-specific pre-information was given. While performance was generally better in younger than older participants, both age groups benefited from auditory pre-information. The analysis of the cue-related event-related potentials revealed age-specific differences in the use of pre-cues: Younger adults showed a pronounced N2 component, suggesting early inhibition of concurrent speech stimuli; older adults exhibited a stronger late P3 component, suggesting increased resource allocation to process the pre-information. In sum, the results argue for an age-specific utilization of auditory pre-information to improve listening in complex dynamic auditory environments.
Affiliation(s)
- Stephan Getzmann
- Aging Research Group, Leibniz Research Centre for Working Environment and Human Factors, Technical University of Dortmund (IfADo), Dortmund, Germany
- Jörg Lewald
- Aging Research Group, Leibniz Research Centre for Working Environment and Human Factors, Technical University of Dortmund (IfADo), Dortmund, Germany; Faculty of Psychology, Ruhr-University Bochum, Bochum, Germany
- Michael Falkenstein
- Aging Research Group, Leibniz Research Centre for Working Environment and Human Factors, Technical University of Dortmund (IfADo), Dortmund, Germany
19
Mitigation of informational masking in individuals with single-sided deafness by integrated bone conduction hearing aids. Ear Hear 2014; 35:41-48. [PMID: 24067501] [DOI: 10.1097/aud.0b013e31829d14e8]
Abstract
OBJECTIVES To confirm an increased susceptibility to informational masking among individuals with single-sided deafness (SSD). To demonstrate a reduction in informational masking when SSD is treated with an integrated bone conduction hearing aid (IBC). To identify the acoustic cues that contribute to IBC-aided masking release. To determine the effects of device experience on the IBC advantage. DESIGN Informational masking was evaluated with the coordinate-response measure. Participants performed the task by reporting color and number coordinates that changed randomly within target sentences. The target sentences were presented in free field accompanied by zero to three distracting sentences. Target and distracting sentences were spoken by different talkers and originated from different source locations, creating two sources of information for auditory streaming. Susceptibility to informational masking was inferred from the error rates of unaided SSD patients relative to normal controls. These baseline measures were derived by testing inexperienced IBC users without the device on the day of their initial fitting. The benefits of IBC-aided listening were assessed by measuring the aided performance of users who had at least 3 months' device experience. The acoustic basis of the listening advantage was isolated by correlating response errors with the voice pitch and location of distracting sentences. The effects of learning on cue effectiveness were evaluated by comparing the error rates of experienced and inexperienced users. RESULTS Unaided SSD participants (inexperienced users) performed as well as normal controls when tested without distracting sentences but produced significantly higher error rates when tested with distracting sentences. Most errors involved responding with coordinates that were contained in distracting sentences. 
This increased susceptibility to informational masking was significantly reduced when experienced IBC users were tested with the device. The listening advantage was most strongly correlated with the availability of voice pitch cues, although performance was also influenced by the location of distracting sentences. Directional asymmetries appear to be dictated by location-dependent cues that are derived from the distinctive transmission characteristics of IBC stimulation. Experienced users made better use of these cues than inexperienced users. CONCLUSIONS These results suggest that informational masking is a significant source of communication impairment among individuals with SSD. Despite the lateralization of auditory function, unaided SSD subjects experience informational masking when distractors occur in either the deaf or normal spatial hemifield. Restoration of aural sensitivity in the deaf hemifield with an IBC enhances speech intelligibility under complex listening conditions, presumably by providing additional sound-segregation cues that are derived from voice pitch and spatial location. The optimal use of these cues is not immediate, but a significant listening advantage is observed after 3 months of unstructured use.
20
Glyde H, Buchholz JM, Dillon H, Cameron S, Hickson L. The importance of interaural time differences and level differences in spatial release from masking. J Acoust Soc Am 2013; 134:EL147-EL152. [PMID: 23927217] [DOI: 10.1121/1.4812441]
Abstract
Numerous studies have described improvements in speech understanding when interaural time differences (ITDs) and interaural level differences (ILDs) are present. The present study aimed to investigate whether either cue in isolation can elicit spatial release from masking (SRM) in a speech-on-speech masking paradigm with maskers positioned symmetrically around the listener. Twelve adults were tested using three presentations of the Listening in Spatialized Noise-Sentences Test, with each presentation modified to contain different interaural cues in the stimuli. Results suggest that ILDs provide a similar amount of SRM as ITDs and ILDs combined. ITDs alone provide significantly less benefit.
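For readers unfamiliar with how ITDs and ILDs are imposed in isolation, a generic headphone-style sketch is shown below. This is an illustrative assumption, not the stimulus-generation code of the study (which modified the Listening in Spatialized Noise-Sentences Test materials): an ITD is a small interchannel delay, and an ILD is an interchannel gain difference.

```python
def lateralize(mono: list[float], fs: int, itd_s: float = 0.0, ild_db: float = 0.0):
    """Impose an ITD (delay) and ILD (attenuation) on the right channel of a mono signal."""
    delay = int(round(itd_s * fs))       # ITDs up to roughly 0.0007 s occur naturally
    gain = 10.0 ** (-ild_db / 20.0)      # attenuate the far (right) ear by ild_db
    right = [0.0] * delay + [s * gain for s in mono]
    left = list(mono) + [0.0] * delay    # zero-pad so both channels match in length
    return left, right
```

Zeroing `itd_s` or `ild_db` yields a stimulus carrying only the other cue, which is the kind of cue isolation the experiment relies on.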
Affiliation(s)
- Helen Glyde
- The HEARing Cooperative Research Centre, 550 Swanston Street, Carlton, Victoria 3053, Australia.
21
Lee JH, Humes LE. Effect of fundamental-frequency and sentence-onset differences on speech-identification performance of young and older adults in a competing-talker background. J Acoust Soc Am 2012; 132:1700-1717. [PMID: 22978898] [PMCID: PMC3460987] [DOI: 10.1121/1.4740482]
Abstract
This study investigated the benefits of differences between sentences in fundamental frequency (F0) and temporal onset for sentence pairs among listener groups differing in age and hearing sensitivity. Two experiments were completed with the primary difference between experiments being the way in which the stimuli were presented. Experiment 1 used blocked stimulus presentation, which ultimately provided redundant acoustic cues to mark the target sentence in each pair, whereas Experiment 2 sampled a slightly more restricted stimulus space, but in a completely randomized presentation order. For both experiments, listeners were required to detect a cue word ("Baron") for the target sentence in each pair and to then identify the target words (color, number) that appeared later in the target sentence. Results of Experiment 1 showed that F0 or onset separation cues were beneficial to both cue-word detection and color-number identification performance. There were no significant differences across groups in the ability to detect the cue word, but groups differed in their ability to identify the correct color-number words. Elderly adults with impaired hearing had the greatest difficulty with the identification task despite the application of spectral shaping to restore the audibility of the speech stimuli. For the most part, the primary results of Experiment 1 were replicated in Experiment 2, although, in the latter experiment, all older adults, whether they had normal or impaired hearing, performed worse than young adults with normal hearing. From Experiment 2, the benefits received for a difference in F0 between talkers of 6 semitones were equivalent to those received for an onset asynchrony of 300 ms between sentences and, for such conditions, the combination of both sound-segregation cues resulted in an additive benefit.
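As a small worked aside (not from the paper itself): an F0 separation expressed in semitones maps to a frequency ratio of 2^(n/12), so the 6-semitone separation found to be beneficial here corresponds to roughly a 1.41x difference in F0 between the talkers.

```python
def semitones_to_ratio(n: float) -> float:
    """Frequency ratio for an interval of n semitones (12-tone equal temperament)."""
    return 2.0 ** (n / 12.0)

ratio = semitones_to_ratio(6)  # about 1.414: the F0 ratio for a 6-semitone separation
```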
Affiliation(s)
- Jae Hee Lee
- Department of Speech and Hearing Sciences, Indiana University, Bloomington, Indiana 47405, USA.
22
Allen K, Alais D, Carlile S. A collection of pseudo-words to study multi-talker speech intelligibility without shifts of spatial attention. Front Psychol 2012; 3:49. [PMID: 22435061] [PMCID: PMC3304086] [DOI: 10.3389/fpsyg.2012.00049]
Abstract
A new collection of pseudo-words was recorded from a single female speaker of American English for use in multi-talker speech intelligibility research. The pseudo-words (known as the KARG collection) consist of three groups of single-syllable pseudo-words varying only by the initial phoneme. The KARG method allows speech intelligibility to be studied free of the influence of shifts of spatial attention from one loudspeaker location to another in multi-talker contexts. To achieve this, all KARG pseudo-words share the same concluding rimes, with only the first phoneme serving as a distinguishing identifier. This ensures that listeners are unable to correctly identify the target pseudo-word without hearing the initial phoneme. Because all the initial phonemes are brief, much shorter than the time required to shift spatial attention, the KARG method assesses speech intelligibility without the confound of shifting spatial attention. The KARG collection is available free for research purposes.
Affiliation(s)
- Kachina Allen
- Auditory, Brain and Cognitive Development Laboratory, McGill University, Montreal, QC, Canada
23
Glyde H, Hickson L, Cameron S, Dillon H. Problems hearing in noise in older adults: a review of spatial processing disorder. Trends Amplif 2011; 15:116-126. [PMID: 22072599] [DOI: 10.1177/1084713811424885]
Abstract
Difficulty understanding speech in background noise, even with amplification to restore audibility, is a common problem for hearing-impaired individuals and is especially frequent in older adults. Despite the debilitating nature of the problem, the cause is not yet completely clear. This review considers the role of spatial processing ability in understanding speech in noise, highlights the potential impact of disordered spatial processing, and attempts to establish whether aging leads to reduced spatial processing ability. Evidence supporting and opposing the hypothesis that spatial processing is disordered among the aging population is presented. With a few notable exceptions, spatial processing ability was shown to be reduced in an older population in comparison to young adults, leading to poorer speech understanding in noise. However, it is argued that concluding that aging negatively affects spatial processing ability may be oversimplified or even premature given potentially confounding factors such as cognitive ability and hearing impairment. Further research is required to determine the effect of aging and hearing impairment on spatial processing and to investigate possible remediation options for spatial processing disorder.
Affiliation(s)
- Helen Glyde
- HEARing Cooperative Research Centre, Carlton, Australia.
24
Best V, Mason CR, Kidd G. Spatial release from masking in normally hearing and hearing-impaired listeners as a function of the temporal overlap of competing talkers. J Acoust Soc Am 2011; 129:1616-1625. [PMID: 21428524] [PMCID: PMC3078033] [DOI: 10.1121/1.3533733]
Abstract
Listeners with sensorineural hearing loss are poorer than listeners with normal hearing at understanding one talker in the presence of another. This deficit is more pronounced when competing talkers are spatially separated, implying a reduced "spatial benefit" in hearing-impaired listeners. This study tested the hypothesis that this deficit is due to increased masking specifically during the simultaneous portions of competing speech signals. Monosyllabic words were compressed to a uniform duration and concatenated to create target and masker sentences with three levels of temporal overlap: 0% (non-overlapping in time), 50% (partially overlapping), or 100% (completely overlapping). Listeners with hearing loss performed particularly poorly in the 100% overlap condition, consistent with the idea that simultaneous speech sounds are most problematic for these listeners. However, spatial release from masking was reduced in all overlap conditions, suggesting that increased masking during periods of temporal overlap is only one factor limiting spatial unmasking in hearing-impaired listeners.
Affiliation(s)
- Virginia Best
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA.
25
Kidd G, Mason CR, Best V, Marrone N. Stimulus factors influencing spatial release from speech-on-speech masking. J Acoust Soc Am 2010; 128:1965-1978. [PMID: 20968368] [PMCID: PMC2981113] [DOI: 10.1121/1.3478781]
Abstract
This study examined spatial release from masking (SRM) when a target talker was masked by competing talkers or by other types of sounds. The focus was on the role of interaural time differences (ITDs) and time-varying interaural level differences (ILDs) under conditions varying in the strength of informational masking (IM). In the first experiment, a target talker was masked by two other talkers that were either colocated with the target or were symmetrically spatially separated from the target with the stimuli presented through loudspeakers. The sounds were filtered into different frequency regions to restrict the available interaural cues. The largest SRM occurred for the broadband condition followed by a low-pass condition. However, even the highest frequency bandpass-filtered condition (3-6 kHz) yielded a significant SRM. In the second experiment the stimuli were presented via earphones. The listeners identified the speech of a target talker masked by one or two other talkers or noises when the maskers were colocated with the target or were perceptually separated by ITDs. The results revealed a complex pattern of masking in which the factors affecting performance in colocated and spatially separated conditions are to a large degree independent.
Affiliation(s)
- Gerald Kidd
- Department of Speech, Language and Hearing Sciences, and Hearing Research Center, Boston University, Boston, Massachusetts 02215, USA
26
Kitterick PT, Bailey PJ, Summerfield AQ. Benefits of knowing who, where, and when in multi-talker listening. J Acoust Soc Am 2010; 127:2498-2508. [PMID: 20370032] [DOI: 10.1121/1.3327507]
Abstract
The benefits of prior information about who would speak, where they would be located, and when they would speak were measured in a multi-talker spatial-listening task. On each trial, a target phrase and several masker phrases were allocated to 13 loudspeakers in a 180 degrees arc, and to 13 overlapping time slots, which started every 800 ms. Speech-reception thresholds (SRTs) were measured as the level of target relative to masker phrases at which listeners reported key words at 71% correct. When phrases started in pairs, all three cues were beneficial ("who" 3.2 dB, "where" 5.1 dB, and "when" 0.3 dB). Over a range of onset asynchronies, SRTs corresponded consistently to a signal-to-noise ratio (SNR) of -2 dB at the start of the target phrase. When phrases started one at a time, SRTs fell to a SNR of -8 dB and were improved significantly, but only marginally, by constraining "who" (1.9 dB), and not by constraining "where" (1.0 dB) or "when" (0.01 dB). Thus, prior information about "who," "where," and "when" was beneficial, but only when talkers started speaking in pairs. Low SRTs may arise when talkers start speaking one at a time because of automatic orienting to phrase onsets and/or the use of loudness differences to distinguish target from masker phrases.
Affiliation(s)
- Pádraig T Kitterick
- Department of Psychology, University of York, York YO10 5DD, United Kingdom.
27
Current World Literature. Curr Opin Otolaryngol Head Neck Surg 2008; 16:490-495. [DOI: 10.1097/moo.0b013e3283130f63]
28
Ihlefeld A, Shinn-Cunningham B. Disentangling the effects of spatial cues on selection and formation of auditory objects. J Acoust Soc Am 2008; 124:2224-2235. [PMID: 19062861] [PMCID: PMC9014243] [DOI: 10.1121/1.2973185]
Abstract
When competing sources come from different directions, a desired target is easier to hear than when the sources are co-located. How much of this improvement is the result of spatial attention rather than improved perceptual segregation of the competing sources is not well understood. Here, listeners' attention was directed to spatial or nonspatial cues when they listened for a target masked by a competing message. A preceding cue signaled the target timbre, location, or both timbre and location. Spatial separation improved performance when the cue indicated the target location, or both the location and timbre, but not when the cue only indicated the target timbre. However, response errors were influenced by spatial configuration in all conditions. Both attention and streaming contributed to spatial effects when listeners actively attended to location. In contrast, when attention was directed to a nonspatial cue, spatial separation primarily appeared to improve the streaming of auditory objects across time. Thus, when attention is focused on location, spatial separation appears to improve both object selection and object formation; when attention is directed to nonspatial cues, separation affects object formation. These results highlight the need to distinguish between these separate mechanisms when considering how observers cope with complex auditory scenes.
Affiliation(s)
- Antje Ihlefeld
- Auditory Neuroscience Laboratory, Boston University Hearing Research Center, 677 Beacon Street, Boston, Massachusetts 02215, USA