1. Kumar S, Nayak S, Kanagokar V, Pitchai Muthu AN. Does bilateral hearing aid fitting improve spatial hearing ability: a systematic review and meta-analysis. Disabil Rehabil Assist Technol 2024; 19:2729-2741. [PMID: 38385777] [DOI: 10.1080/17483107.2024.2316293]
Abstract
OBJECTIVES The ability to localize sound sources is crucial for everyday listening, as it contributes to spatial awareness and the detection of warning signs. Individuals with hearing impairment have poorer localization abilities, which further deteriorate when they are fitted with a hearing aid. Although numerous studies have addressed this phenomenon, there is a lack of systematic evidence. The aim of the current systematic review is to address the following research question: "Do behavioural measures of spatial hearing ability improve with bilateral hearing aid fitting compared to the unaided hearing condition?" DESIGN A comprehensive search of various electronic databases, covering the period 1965 to 2022, was conducted by two independent authors. The inclusion and exclusion criteria were formulated using the Population, Intervention, Comparison, Outcome, and Study design (PICOS) format, and the certainty of evidence was determined using the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) guidelines. RESULTS The comprehensive search yielded 2199 studies, of which 17 were included in the qualitative synthesis and 15 in the quantitative synthesis. The collected data were divided into two groups, vertical and horizontal localization. The results of the quantitative analysis indicate that localization performance was significantly better in the unaided condition for both vertical and horizontal planes. The certainty of our evidence was judged to be moderate, meaning that "we are moderately confident in the effect estimate. The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different". CONCLUSION The review findings demonstrate that bilateral fitting of hearing aids did not effectively preserve spatial cues, which resulted in poorer localization performance irrespective of the plane of assessment. REVIEW REGISTRATION Prospective Register of Systematic Reviews (PROSPERO); CRD42022358164.
Affiliation(s)
- Sathish Kumar
- Department of Audiology and Speech-Language Pathology, Kasturba Medical College Mangalore, Manipal Academy of Higher Education, Manipal, India
- Srikanth Nayak
- Department of Audiology and Speech-Language Pathology, Yenepoya Medical College, Yenepoya University (Deemed to be University), Mangalore, India
- Vibha Kanagokar
- Department of Audiology and Speech-Language Pathology, Kasturba Medical College Mangalore, Manipal Academy of Higher Education, Manipal, India
- Arivudai Nambi Pitchai Muthu
- Department of Audiology and Speech-Language Pathology, Kasturba Medical College Mangalore, Manipal Academy of Higher Education, Manipal, India

2. Carlini A, Bordeau C, Ambard M. Auditory localization: a comprehensive practical review. Front Psychol 2024; 15:1408073. [PMID: 39049946] [PMCID: PMC11267622] [DOI: 10.3389/fpsyg.2024.1408073]
Abstract
Auditory localization is a fundamental ability that allows the listener to perceive the spatial location of a sound source in the environment. The present work aims to provide a comprehensive overview of the mechanisms and acoustic cues used by the human perceptual system to achieve such accurate auditory localization. Acoustic cues are derived from the physical properties of sound waves, and many factors allow and influence auditory localization abilities. This review presents the monaural and binaural perceptual mechanisms involved in auditory localization in the three dimensions. Besides the main mechanisms of Interaural Time Difference, Interaural Level Difference and Head Related Transfer Function, secondary but important elements, such as reverberation and motion, are also analyzed. For each mechanism, the perceptual limits of localization abilities are presented. A section is specifically devoted to reference systems in space, and to the pointing methods used in experimental research. Finally, some cases of misperception and auditory illusion are described. More than a simple description of the perceptual mechanisms underlying localization, this paper is also intended to provide practical information useful for experiments and work in the auditory field.
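As a worked example of the interaural-time-difference mechanism reviewed above (my notation, not taken from the review), the rigid-spherical-head approximation due to Woodworth gives, for a source at azimuth θ in the front horizontal plane,

```latex
\mathrm{ITD}(\theta) \;\approx\; \frac{r}{c}\,\bigl(\theta + \sin\theta\bigr),
\qquad -\tfrac{\pi}{2} \le \theta \le \tfrac{\pi}{2},
```

where r is the head radius and c the speed of sound. With r ≈ 0.0875 m and c ≈ 343 m/s, the maximum ITD at θ = 90° is about (0.0875/343)(π/2 + 1) ≈ 656 μs, the order of magnitude relevant to the binaural limits discussed in the review.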

3. Day ML. Head-related transfer functions of rabbits within the front horizontal plane. Hear Res 2024; 441:108924. [PMID: 38061267] [PMCID: PMC10872353] [DOI: 10.1016/j.heares.2023.108924]
Abstract
The head-related transfer function (HRTF) describes the direction-dependent acoustic filtering by the head that occurs between a source signal in free-field space and the signal at the tympanic membrane. HRTFs contain information on sound source location via interaural differences of their magnitude or phase spectra and via the shapes of their magnitude spectra. The present study characterized HRTFs for source locations in the front horizontal plane for nine rabbits, which are a species commonly used in studies of the central auditory system. HRTF magnitude spectra shared several features across individuals, including a broad spectral peak at 2.6 kHz that increased gain by 12 to 23 dB depending on source azimuth; and a notch at 7.6 kHz and peak at 9.8 kHz visible for most azimuths. Overall, frequencies above 4 kHz were amplified for sources ipsilateral to the ear and progressively attenuated for frontal and contralateral azimuths. The slope of the magnitude spectrum between 3 and 5 kHz was found to be an unambiguous monaural cue for source azimuths ipsilateral to the ear. Average interaural level difference (ILD) between 5 and 16 kHz varied monotonically with azimuth over ±31 dB despite a relatively small head size. Interaural time differences (ITDs) at 0.5 kHz and 1.5 kHz also varied monotonically with azimuth over ±358 μs and ±260 μs, respectively. Remeasurement of HRTFs after pinna removal revealed that the large pinnae of rabbits were responsible for all spectral peaks and notches in magnitude spectra and were the main contribution to high-frequency ILDs (5-16 kHz), whereas the rest of the head was the main contribution to ITDs and low-frequency ILDs (0.2-1.5 kHz). Lastly, inter-individual differences in magnitude spectra were found to be small enough that deviations of individual HRTFs from an average HRTF were comparable in size to measurement error. Therefore, the average HRTF may be acceptable for use in neural or behavioral studies of rabbits implementing virtual acoustic space when measurement of individualized HRTFs is not possible.
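To make the interaural analyses above concrete, here is a minimal sketch of how ITD and band-limited ILD might be estimated from one measured HRIR pair; the cross-correlation approach, the filter settings, and all function and variable names are illustrative assumptions, not the author's exact procedure.

```python
import numpy as np
from scipy.signal import butter, correlate, sosfiltfilt

def itd_ild_from_hrir(h_left, h_right, fs, band=(5000.0, 16000.0)):
    """Estimate ITD (s) and ILD (dB) from a pair of head-related impulse
    responses. Sketch only; assumes fs is well above twice the band edge."""
    # ITD: lag of the maximum of the interaural cross-correlation.
    xcorr = correlate(h_left, h_right, mode="full")
    lags = np.arange(-len(h_right) + 1, len(h_left))
    itd = lags[np.argmax(xcorr)] / fs

    # ILD: energy ratio after band-limiting (e.g., 5-16 kHz as in the study).
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    energy_left = np.sum(sosfiltfilt(sos, h_left) ** 2)
    energy_right = np.sum(sosfiltfilt(sos, h_right) ** 2)
    ild = 10.0 * np.log10(energy_left / energy_right)
    return itd, ild
```

A positive ITD here means the left-ear signal is delayed, i.e., the right ear leads; the sign convention, like everything else in the sketch, is an assumption.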
Affiliation(s)
- Mitchell L Day
- Department of Biological Sciences, Ohio University, Athens, OH 45701, USA.

4. Day ML. Head-related transfer functions of rabbits within the front horizontal plane. bioRxiv (preprint) 2023:2023.09.15.557943. [PMID: 37745541] [PMCID: PMC10516025] [DOI: 10.1101/2023.09.15.557943]
Abstract
The head-related transfer function (HRTF) is the direction-dependent acoustic filtering by the head that occurs between a source signal in free-field space and the signal at the tympanic membrane. HRTFs contain information on sound source location via interaural differences of their magnitude or phase spectra and via the shapes of their magnitude spectra. The present study characterized HRTFs for source locations in the front horizontal plane for nine rabbits, which are a species commonly used in studies of the central auditory system. HRTF magnitude spectra shared several features across individuals, including a broad spectral peak at 2.6 kHz that increased gain by 12 to 23 dB depending on source azimuth; and a notch at 7.6 kHz and peak at 9.8 kHz visible for most azimuths. Overall, frequencies above 4 kHz were amplified for sources ipsilateral to the ear and progressively attenuated for frontal and contralateral azimuths. The slope of the magnitude spectrum between 3 and 5 kHz was found to be an unambiguous monaural cue for source azimuths ipsilateral to the ear. Average interaural level difference (ILD) between 5 and 16 kHz varied monotonically with azimuth over ±31 dB despite a relatively small head size. Interaural time differences (ITDs) at 0.5 kHz and 1.5 kHz also varied monotonically with azimuth over ±358 μs and ±260 μs, respectively. Remeasurement of HRTFs after pinna removal revealed that the large pinnae of rabbits were responsible for all spectral peaks and notches in magnitude spectra and were the main contribution to high-frequency ILDs, whereas the rest of the head was the main contribution to ITDs and low-frequency ILDs. Lastly, inter-individual differences in magnitude spectra were found to be small enough that deviations of individual HRTFs from an average HRTF were comparable in size to measurement error. Therefore, the average HRTF may be acceptable for use in neural or behavioral studies of rabbits implementing virtual acoustic space when measurement of individualized HRTFs is not possible.

5. McLachlan G, Majdak P, Reijniers J, Mihocic M, Peremans H. Dynamic spectral cues do not affect human sound localization during small head movements. Front Neurosci 2023; 17:1027827. [PMID: 36816108] [PMCID: PMC9936143] [DOI: 10.3389/fnins.2023.1027827]
Abstract
Natural listening involves the constant deployment of small head movements. Spatial listening is facilitated by head movements, especially when resolving front-back confusions, an otherwise common issue during sound localization under head-still conditions. The present study investigated which acoustic cues are utilized by human listeners to localize sounds using small head movements (below ±10° around the center). Seven normal-hearing subjects participated in a sound localization experiment in a virtual reality environment. Four acoustic cue stimulus conditions were presented (full spectrum, flattened spectrum, frozen spectrum, free-field) under three movement conditions (no movement, head rotations over the yaw axis and over the pitch axis). Localization performance was assessed using three metrics: lateral and polar precision error and front-back confusion rate. Analysis through mixed-effects models showed that even small yaw rotations provide a remarkable decrease in front-back confusion rate, whereas pitch rotations did not show much of an effect. Furthermore, monaural spectral shape (MSS) cues improved localization performance even in the presence of dynamic interaural time difference (dITD) cues. However, performance was similar between stimuli with and without dynamic MSS (dMSS) cues. This indicates that human listeners utilize the MSS cues available before the head moves, but do not rely on dMSS cues to localize sounds when making small head movements.
Affiliation(s)
- Glen McLachlan
- Department of Engineering Management, University of Antwerp, Antwerp, Belgium
- Piotr Majdak
- Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
- Jonas Reijniers
- Department of Engineering Management, University of Antwerp, Antwerp, Belgium
- Michael Mihocic
- Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
- Herbert Peremans
- Department of Engineering Management, University of Antwerp, Antwerp, Belgium

6. Deep neural network models of sound localization reveal how perception is adapted to real-world environments. Nat Hum Behav 2022; 6:111-133. [PMID: 35087192] [PMCID: PMC8830739] [DOI: 10.1038/s41562-021-01244-z]
Abstract
Mammals localize sounds using information from their two ears. Localization in real-world conditions is challenging, as echoes provide erroneous information, and noises mask parts of target sounds. To better understand real-world localization we equipped a deep neural network with human ears and trained it to localize sounds in a virtual environment. The resulting model localized accurately in realistic conditions with noise and reverberation. In simulated experiments, the model exhibited many features of human spatial hearing: sensitivity to monaural spectral cues and interaural time and level differences, integration across frequency, biases for sound onsets, and limits on localization of concurrent sources. But when trained in unnatural environments without either reverberation, noise, or natural sounds, these performance characteristics deviated from those of humans. The results show how biological hearing is adapted to the challenges of real-world environments and illustrate how artificial neural networks can reveal the real-world constraints that shape perception.

7. Braren HS, Fels J. Towards Child-Appropriate Virtual Acoustic Environments: A Database of High-Resolution HRTF Measurements and 3D-Scans of Children. Int J Environ Res Public Health 2021; 19:324. [PMID: 35010583] [PMCID: PMC8750994] [DOI: 10.3390/ijerph19010324]
Abstract
Head-related transfer functions (HRTFs) play a significant role in modern acoustic experiment designs in the auralization of 3-dimensional virtual acoustic environments. This technique enables us to create close-to-real-life situations, including room-acoustic effects, background noise and multiple sources, in a controlled laboratory environment. While adult HRTF databases are widely available to the research community, datasets of children are not. To fill this gap, children aged 5-10 years were recruited among 1st and 2nd year primary school children in Aachen, Germany. Their HRTFs were measured in the hemi-anechoic chamber with a 5-degree × 5-degree resolution. Special care was taken to reduce artifacts from motion during the measurements by means of fast measurement routines. To complement the HRTF measurements with the anthropometric data needed for individualization methods, a high-resolution 3D-scan of the head and upper torso of each participant was recorded. The HRTF measurement took around 3 min. The children's head movement during that time was larger than that of adult participants in comparable experiments but was generally kept within 5 degrees of rotary and 1 cm of translatory motion; adult participants only exhibit this range of motion in longer duration measurements. A comparison of the HRTF measurements to the KEMAR artificial head shows that it is not representative of an average child HRTF. Differences can be seen in both the spectrum and the interaural time delay (ITD), with differences of 70 μs on average and a maximum difference of 138 μs. For both spectrum and ITD, the KEMAR more closely resembles the 95th percentile of the range of children's data. This warrants a closer look at using child-specific HRTFs in the binaural presentation of virtual acoustic environments in the future.
Affiliation(s)
- Hark Simon Braren
- Institute for Hearing Technology and Acoustics, RWTH Aachen University, Kopernikusstraße 5, 52074 Aachen, Germany

8. Stawicki M, Majdak P, Başkent D. Ventriloquist Illusion Produced With Virtual Acoustic Spatial Cues and Asynchronous Audiovisual Stimuli in Both Young and Older Individuals. Multisens Res 2019; 32:745-770. [DOI: 10.1163/22134808-20191430]
Abstract
The ventriloquist illusion, the change in perceived location of an auditory stimulus when a synchronously presented but spatially discordant visual stimulus is added, has previously been shown in young healthy populations to be a robust paradigm that mainly relies on automatic processes. Here, we propose the ventriloquist illusion as a potential simple test to assess audiovisual (AV) integration in young and older individuals. We used a modified version of the illusion paradigm that was adaptive, nearly bias-free, relied on binaural stimulus representation using generic head-related transfer functions (HRTFs) instead of multiple loudspeakers, and was tested with synchronous and asynchronous presentation of AV stimuli (both tone and speech). The minimum audible angle (MAA), the smallest perceptible difference in angle between two sound sources, was compared with or without the visual stimuli in young and older adults with no or minimal sensory deficits. The illusion effect, measured by means of MAAs implemented with HRTFs, was observed with both synchronous and asynchronous visual stimuli, but only with the tone and not the speech stimulus. The patterns were similar between young and older individuals, indicating the versatility of the modified ventriloquist illusion paradigm.
Affiliation(s)
- Marnix Stawicki
- Department of Otorhinolaryngology / Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Graduate School of Medical Sciences, Research School of Behavioral and Cognitive Neurosciences (BCN), University of Groningen, Groningen, The Netherlands
- Piotr Majdak
- Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
- Deniz Başkent
- Department of Otorhinolaryngology / Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Graduate School of Medical Sciences, Research School of Behavioral and Cognitive Neurosciences (BCN), University of Groningen, Groningen, The Netherlands

9. Auditory motion perception emerges from successive sound localizations integrated over time. Sci Rep 2019; 9:16437. [PMID: 31712688] [PMCID: PMC6848124] [DOI: 10.1038/s41598-019-52742-0]
Abstract
Humans rely on auditory information to estimate the path of moving sound sources. But unlike in vision, the existence of motion-sensitive mechanisms in audition is still open to debate. Psychophysical studies indicate that auditory motion perception emerges from successive localizations, but existing models, which do not account for any temporal integration, fail to predict experimental results. We propose a new model that tracks motion using successive localization snapshots integrated over time. This model is derived from psychophysical experiments on the upper limit for circular auditory motion perception (UL), defined as the speed above which humans no longer identify the direction of sounds spinning around them. Our model predicts ULs measured with different stimuli using solely static localization cues. The temporal integration blurs these localization cues, rendering them unreliable at high speeds, which results in the UL. Our findings indicate that auditory motion perception does not require motion-sensitive mechanisms.

10. Balachandar K, Carlile S. The monaural spectral cues identified by a reverse correlation analysis of free-field auditory localization data. J Acoust Soc Am 2019; 146:29. [PMID: 31370620] [DOI: 10.1121/1.5113577]
Abstract
The outer-ear's location-dependent pattern of spectral filtering generates cues used to determine a sound source's elevation as well as front-back location. The authors aim to identify these features using a reverse correlation analysis (RCA), combining free-field localization behaviour with the associated head-related transfer functions' (HRTFs) magnitude spectrum from a sample of 73 participants. Localization responses were collected before and immediately after introducing a pair of outer-ear inserts which modified the listener's HRTFs to varying extent. The RCA identified several different features responsible for eliciting localization responses. The efficacy of these was examined using two models of monaural localization. In general, the predicted performance was closely aligned with the free-field localization error for the bare-ear condition; however, both models tended to grossly over-estimate the localization error based on HRTFs modified by the outer-ear inserts. The RCA's feature selection notably had the effect of better aligning the predicted performance of both models to the actual localization performance. This suggests that the RCA revealed sufficient detail for both models to correctly predict localization performance and also limited the influence of filtered-out elements in the distorted HRTFs that contributed to the degraded accuracy of both models.
Affiliation(s)
- Kapilesh Balachandar
- Auditory Neuroscience Laboratory, University of Sydney, New South Wales 2006, Australia
- Simon Carlile
- Auditory Neuroscience Laboratory, University of Sydney, New South Wales 2006, Australia

11. Ege R, Van Opstal AJ, Van Wanrooij MM. Perceived Target Range Shapes Human Sound-Localization Behavior. eNeuro 2019; 6:ENEURO.0111-18.2019. [PMID: 30963103] [PMCID: PMC6451157] [DOI: 10.1523/eneuro.0111-18.2019]
Abstract
The auditory system relies on binaural differences and spectral pinna cues to localize sounds in azimuth and elevation. However, the acoustic input can be unreliable, due to uncertainty about the environment, and neural noise. A possible strategy to reduce sound-location uncertainty is to integrate the sensory observations with sensorimotor information from previous experience, to infer where sounds are more likely to occur. We investigated whether and how human sound localization performance is affected by the spatial distribution of target sounds, and changes thereof. We tested three different open-loop paradigms, in which we varied the spatial range of sounds in different ways. For the narrowest ranges, target-response gains were highly idiosyncratic and deviated from an optimal gain predicted by error-minimization; in the horizontal plane the deviation typically consisted of a response overshoot. Moreover, participants adjusted their behavior by rapidly adapting their gain to the target range, both in elevation and in azimuth, yielding behavior closer to optimal for larger target ranges. Notably, gain changes occurred without any exogenous feedback about performance. We discuss how the findings can be explained by a sub-optimal model in which the motor-control system reduces its response error across trials to within an acceptable range, rather than strictly minimizing the error.
Affiliation(s)
- Rachel Ege
- Department of Biophysics, Radboud University, Donders Institute for Brain, Cognition and Behaviour, 6525 AJ Nijmegen, The Netherlands
- A John Van Opstal
- Department of Biophysics, Radboud University, Donders Institute for Brain, Cognition and Behaviour, 6525 AJ Nijmegen, The Netherlands
- Marc M Van Wanrooij
- Department of Biophysics, Radboud University, Donders Institute for Brain, Cognition and Behaviour, 6525 AJ Nijmegen, The Netherlands

12. Rajendran VG, Gamper H. Spectral manipulation improves elevation perception with non-individualized head-related transfer functions. J Acoust Soc Am 2019; 145:EL222. [PMID: 31067970] [DOI: 10.1121/1.5093641]
Abstract
Spatially rendering sounds using head-related transfer functions (HRTFs) is an important part of creating immersive audio experiences for virtual reality applications. However, elevation perception remains challenging when generic, non-personalized HRTFs are used. This study investigated whether digital audio effects applied to a generic set of HRTFs could improve sound localization in the vertical plane. Several of the tested effects significantly improved elevation judgment, and trial-by-trial variability in spectral energy between 2 and 10 kHz correlated strongly with perceived elevation. Digital audio effects may therefore be a promising strategy to improve elevation perception where personalized HRTFs are not available.
Affiliation(s)
- Vani G Rajendran
- Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
- Hannes Gamper
- Audio and Acoustics Research Group, Microsoft Research, 14865 Northeast 36th Street, Redmond, Washington 98052, USA

13. Zonooz B, Arani E, Körding KP, Aalbers PATR, Celikel T, Van Opstal AJ. Spectral Weighting Underlies Perceived Sound Elevation. Sci Rep 2019; 9:1642. [PMID: 30733476] [PMCID: PMC6367479] [DOI: 10.1038/s41598-018-37537-z]
Abstract
The brain estimates the two-dimensional direction of sounds from the pressure-induced displacements of the eardrums. Accurate localization along the horizontal plane (azimuth angle) is enabled by binaural difference cues in timing and intensity. Localization along the vertical plane (elevation angle), including frontal and rear directions, relies on spectral cues made possible by the elevation dependent filtering in the idiosyncratic pinna cavities. However, the problem of extracting elevation from the sensory input is ill-posed, since the spectrum results from a convolution between source spectrum and the particular head-related transfer function (HRTF) associated with the source elevation, which are both unknown to the system. It is not clear how the auditory system deals with this problem, or which implicit assumptions it makes about source spectra. By varying the spectral contrast of broadband sounds around the 6–9 kHz band, which falls within the human pinna’s most prominent elevation-related spectral notch, we here suggest that the auditory system performs a weighted spectral analysis across different frequency bands to estimate source elevation. We explain our results by a model, in which the auditory system weighs the different spectral bands, and compares the convolved weighted sensory spectrum with stored information about its own HRTFs, and spatial prior assumptions.
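A hedged formalization of the weighted spectral comparison described in this abstract (the symbols and the squared-error form are my simplification, not the authors' exact model): with a sensory spectrum Y(f) in dB, stored HRTF spectra H_ε(f) for candidate elevations ε, band weights w(f), and a spatial prior p(ε), the elevation estimate could take the form

```latex
\hat{\varepsilon} \;=\; \arg\max_{\varepsilon}
\left[\, \log p(\varepsilon) \;-\; \lambda \sum_{f} w(f)\,\bigl(Y(f) - H_{\varepsilon}(f)\bigr)^{2} \right],
```

where λ sets the weight of the sensory evidence relative to the prior.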
Affiliation(s)
- Bahram Zonooz
- Biophysics Department, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, 6525 AJ Nijmegen, The Netherlands
- Elahe Arani
- Biophysics Department, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, 6525 AJ Nijmegen, The Netherlands
- Konrad P Körding
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA
- Department of Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
- P A T Remco Aalbers
- Biophysics Department, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, 6525 AJ Nijmegen, The Netherlands
- Tansu Celikel
- Neurophysiology Department, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, 6525 AJ Nijmegen, The Netherlands
- A John Van Opstal
- Biophysics Department, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, 6525 AJ Nijmegen, The Netherlands

14. Zonooz B, Arani E, Van Opstal AJ. Learning to localise weakly-informative sound spectra with and without feedback. Sci Rep 2018; 8:17933. [PMID: 30560940] [PMCID: PMC6298951] [DOI: 10.1038/s41598-018-36422-z]
Abstract
How the human auditory system learns to map complex pinna-induced spectral-shape cues onto veridical estimates of sound-source elevation in the median plane is still unclear. Earlier studies demonstrated considerable sound-localisation plasticity after applying pinna moulds, and in response to altered vision. Several factors may contribute to auditory spatial learning, like visual or motor feedback, or updated priors. We here induced perceptual learning for sounds with degraded spectral content, having weak, but consistent, elevation-dependent cues, as demonstrated by low-gain stimulus-response relations. During training, we provided visual feedback for only six targets in the midsagittal plane, to which listeners gradually improved their response accuracy. Interestingly, listeners' performance also improved without visual feedback, albeit less strongly. Post-training results showed generalised improved response behaviour, also to non-trained locations and acoustic spectra, presented throughout the two-dimensional frontal hemifield. We argue that the auditory system learns to reweigh contributions from low-informative spectral bands to update its prior elevation estimates, and explain our results with a neuro-computational model.
Affiliation(s)
- Bahram Zonooz
- Biophysics Department, Donders Center for Neuroscience, Radboud University, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands
- Elahe Arani
- Biophysics Department, Donders Center for Neuroscience, Radboud University, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands
- A John Van Opstal
- Biophysics Department, Donders Center for Neuroscience, Radboud University, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands

15.
Abstract
Sensory representations are typically endowed with intrinsic noise, leading to variability and inaccuracies in perceptual responses. The Bayesian framework accounts for an optimal strategy to deal with sensory-motor uncertainty, by combining the noisy sensory input with prior information regarding the distribution of stimulus properties. The maximum-a-posteriori (MAP) estimate selects the perceptual response from the peak (mode) of the resulting posterior distribution, which ensures an optimal accuracy-precision trade-off when the underlying distributions are Gaussians (minimal mean-squared error, with minimum response variability). We tested this model on human eye-movement responses toward broadband sounds, masked by various levels of background noise, and for head movements to sounds with poor spectral content. We report that the response gain (accuracy) and variability (precision) of the elevation response components changed systematically with the signal-to-noise ratio of the target sound: gains were high for high SNRs and decreased for low SNRs. In contrast, the azimuth response components maintained high gains for all conditions, as predicted by maximum-likelihood estimation. However, we found that the elevation data did not follow the MAP prediction. Instead, results were better described by an alternative decision strategy, in which the response results from taking a random sample from the posterior in each trial. We discuss two potential implementations of a simple posterior sampling scheme in the auditory system that account for the results, and argue that although the observed response strategies for azimuth and elevation are sub-optimal with respect to their variability, they allow the auditory system to actively explore the environment in the absence of adequate sensory evidence.
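The two decision rules contrasted in this abstract can be written compactly (notation mine): given a likelihood p(x | θ) of sensory input x for source direction θ and a prior p(θ),

```latex
p(\theta \mid x) \;\propto\; p(x \mid \theta)\,p(\theta), \qquad
\hat{\theta}_{\mathrm{MAP}} \;=\; \arg\max_{\theta}\, p(\theta \mid x), \qquad
\hat{\theta}_{\mathrm{sample}} \;\sim\; p(\theta \mid x).
```

For Gaussian likelihood and prior, the MAP estimate minimizes the mean-squared error and adds no decision-induced variability, whereas drawing one posterior sample per trial has the same mean but adds trial-to-trial variability that grows as the likelihood broadens, which matches the elevation behaviour reported here at low signal-to-noise ratios.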

16. Campos J, Ramkhalawansingh R, Pichora-Fuller MK. Hearing, self-motion perception, mobility, and aging. Hear Res 2018; 369:42-55. [DOI: 10.1016/j.heares.2018.03.025]

17. Denk F, Ewert SD, Kollmeier B. Spectral directional cues captured by hearing device microphones in individual human ears. J Acoust Soc Am 2018; 144:2072. [PMID: 30404454] [DOI: 10.1121/1.5056173]
Abstract
Spatial hearing abilities with hearing devices ultimately depend on how well acoustic directional cues are captured by the microphone(s) of the device. A comprehensive objective evaluation of monaural spectral directional cues captured at 9 microphone locations integrated in 5 hearing device styles is presented, utilizing a recent database of head-related transfer functions (HRTFs) that includes data from 16 human and 3 artificial ear pairs. Differences between HRTFs to the eardrum and hearing device microphones were assessed by descriptive analyses and quantitative metrics, and compared to differences between individual ears. Directional information exploited for vertical sound localization was evaluated by means of computational models. Directional information at microphone locations inside the pinna is significantly biased and qualitatively poorer compared to locations in the ear canal; behind-the-ear microphones capture almost no directional cues. These errors are expected to impair vertical sound localization, even if the new cues were optimally mapped to locations. Differences between HRTFs to the eardrum and hearing device microphones are qualitatively different from between-subject differences and can be described as a partial destruction rather than an alteration of relevant cues, although spectral difference metrics produce similar results. Dummy heads do not fully reflect the results obtained with individual subjects.
Affiliation(s)
- Florian Denk
- Medizinische Physik and Cluster of Excellence "Hearing4all," University of Oldenburg, Küpkersweg 74, 26129 Oldenburg, Germany
- Stephan D Ewert
- Medizinische Physik and Cluster of Excellence "Hearing4all," University of Oldenburg, Küpkersweg 74, 26129 Oldenburg, Germany
- Birger Kollmeier
- Medizinische Physik and Cluster of Excellence "Hearing4all," University of Oldenburg, Küpkersweg 74, 26129 Oldenburg, Germany

18. The Encoding of Sound Source Elevation in the Human Auditory Cortex. J Neurosci 2018; 38:3252-3264. [PMID: 29507148] [DOI: 10.1523/jneurosci.2530-17.2018]
Abstract
Spatial hearing is a crucial capacity of the auditory system. While the encoding of horizontal sound direction has been extensively studied, very little is known about the representation of vertical sound direction in the auditory cortex. Using high-resolution fMRI, we measured voxelwise sound elevation tuning curves in human auditory cortex and show that sound elevation is represented by broad tuning functions preferring lower elevations as well as secondary narrow tuning functions preferring individual elevation directions. We changed the ear shape of participants (male and female) with silicone molds for several days. This manipulation reduced or abolished the ability to discriminate sound elevation and flattened cortical tuning curves. Tuning curves recovered their original shape as participants adapted to the modified ears and regained elevation perception over time. These findings suggest that the elevation tuning observed in low-level auditory cortex did not arise from the physical features of the stimuli but is contingent on experience with spectral cues and covaries with the change in perception. One explanation for this observation may be that the tuning in low-level auditory cortex underlies the subjective perception of sound elevation. SIGNIFICANCE STATEMENT: This study addresses two fundamental questions about the brain representation of sensory stimuli: how the vertical spatial axis of auditory space is represented in the auditory cortex and whether low-level sensory cortex represents physical stimulus features or subjective perceptual attributes. Using high-resolution fMRI, we show that vertical sound direction is represented by broad tuning functions preferring lower elevations as well as secondary narrow tuning functions preferring individual elevation directions. In addition, we demonstrate that the shape of these tuning functions is contingent on experience with spectral cues and covaries with the change in perception, which may indicate that the tuning functions in low-level auditory cortex underlie the perceived elevation of a sound source.

19. Spence C, Lee J, Van der Stoep N. Responding to sounds from unseen locations: crossmodal attentional orienting in response to sounds presented from the rear. Eur J Neurosci 2017; 51:1137-1150. [PMID: 28973789] [DOI: 10.1111/ejn.13733]
Abstract
To date, most of the research on spatial attention has focused on probing people's responses to stimuli presented in frontal space. That is, few researchers have attempted to assess what happens in the space that is currently unseen (essentially rear space). In a sense, then, 'out of sight' is, very much, 'out of mind'. In this review, we highlight what is presently known about the perception and processing of sensory stimuli (focusing on sounds) whose source is not currently visible. We briefly summarize known differences in the localizability of sounds presented from different locations in 3D space, and discuss the consequences for the crossmodal attentional and multisensory perceptual interactions taking place in various regions of space. The latest research now clearly shows that the kinds of crossmodal interactions that take place in rear space are very often different in kind from those that have been documented in frontal space. Developing a better understanding of how people respond to unseen sound sources in naturalistic environments by integrating findings emerging from multiple fields of research will likely lead to the design of better warning signals in the future. This review highlights the need for neuroscientists interested in spatial attention to spend more time researching what happens (in terms of the covert and overt crossmodal orienting of attention) in rear space.
Affiliation(s)
- Charles Spence
- Crossmodal Research Laboratory, Department of Experimental Psychology, Oxford University, Oxford, OX1 3UD, UK
- Jae Lee
- Crossmodal Research Laboratory, Department of Experimental Psychology, Oxford University, Oxford, OX1 3UD, UK
- Nathan Van der Stoep
- Experimental Psychology, Helmholtz Institute, Utrecht University, Utrecht, The Netherlands

20. Andreopoulou A, Katz BFG. Identification of perceptually relevant methods of inter-aural time difference estimation. J Acoust Soc Am 2017; 142:588. [PMID: 28863557] [DOI: 10.1121/1.4996457]
Abstract
The inter-aural time difference (ITD) is a fundamental cue for human sound localization. Over the past decades several methods have been proposed for its estimation from measured head-related impulse response (HRIR) data. Nevertheless, inter-method variations in ITD calculation have been found to exceed the known just noticeable differences (JNDs), leading to possible perceptible artifacts in virtual binaural auditory scenes when personalized HRIRs are being used. In the absence of an objective means of validating ITD estimations, this paper examines which methods lead to the most perceptually relevant results. A subjective lateralization study compared objective ITDs to perceptually evaluated inter-aural pure delay offsets. Results clearly indicate that the first-onset threshold detection method, using a low relative threshold of -30 dB applied to 3 kHz low-pass filtered HRIRs, is consistently the most perceptually relevant procedure across various metrics. Several alternative threshold values, as well as methods based on the maximum or centroid of the inter-aural cross correlation of similarly filtered HRIRs or HRIR envelopes, also provided reasonable results. On the contrary, phase-based methods employing the integrated relative group delay or an auditory model did not perform as well.
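A minimal sketch of the procedure the abstract singles out as most perceptually relevant, i.e., 3 kHz low-pass filtering followed by first-onset detection at a -30 dB threshold relative to the peak; the filter order and other implementation details are my assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def onset_itd(h_left, h_right, fs, lp_cutoff=3000.0, thresh_db=-30.0):
    """ITD (s) via first-onset threshold detection on low-pass filtered
    HRIRs. Sketch of the procedure favoured by the paper; details assumed."""
    sos = butter(4, lp_cutoff, btype="lowpass", fs=fs, output="sos")

    def onset_sample(h):
        env = np.abs(sosfiltfilt(sos, h))
        threshold = env.max() * 10.0 ** (thresh_db / 20.0)  # -30 dB re. peak
        return np.argmax(env >= threshold)  # first sample above threshold

    # Positive value: left-ear onset is later, i.e., the right ear leads.
    return (onset_sample(h_left) - onset_sample(h_right)) / fs
```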
Affiliation(s)
- Areti Andreopoulou
- Audio and Acoustic Group, LIMSI, CNRS, Université Paris-Saclay, Orsay, France
- Brian F G Katz
- Sorbonne Universités, UPMC Université Paris 06, CNRS, Institut d'Alembert, Paris, France

21. Joubaud T, Zimpfer V, Garcia A, Langrenne C. Sound localization models as evaluation tools for tactical communication and protective systems. J Acoust Soc Am 2017; 141:2637. [PMID: 28464634] [DOI: 10.1121/1.4979693]
Abstract
Tactical Communication and Protective Systems (TCAPS) are hearing protection devices designed to protect the listener's ears from hazardous sounds while preserving speech intelligibility. However, previous studies demonstrated that TCAPS still deteriorate the listener's situational awareness, in particular the ability to locate sound sources. On the horizontal plane, this is mainly explained by the degradation of the acoustical cues that normally prevent the listener from making front-back confusions. As part of TCAPS development and assessment, a method predicting the TCAPS-induced degradation of sound localization capability based on electroacoustic measurements would be more suitable than time-consuming behavioral experiments. In this context, the present paper investigates two methods based on Head-Related Transfer Functions (HRTFs): a template-matching model and a three-layer neural network. They are optimized to fit human sound source identification performance in the open-ear condition. The methods are applied to HRTFs measured with six TCAPS, providing identification probabilities. They are compared with the results of a behavioral experiment, conducted with the same protectors, which ranks the TCAPS by type. The neural network predicts realistic performances with earplugs, but overestimates errors with earmuffs. The template-matching model predicts human performance well, except for two particular TCAPS.
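A minimal sketch of a template-matching localization model of the general kind evaluated here: the HRTF magnitude spectrum measured through a protector for an unknown direction is compared against a bank of open-ear templates, and spectral distances are mapped to identification probabilities. The RMS spectral distance, the softmax mapping, and all names are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

def identification_probs(observed_db, templates_db, beta=1.0):
    """Map spectral distances between an observed magnitude spectrum and a
    bank of direction templates to identification probabilities.

    observed_db  : (n_freqs,) observed spectrum in dB
    templates_db : (n_directions, n_freqs) open-ear template spectra in dB
    beta         : sharpness of the distance-to-probability mapping
    """
    # RMS spectral distance between the observation and each template.
    dist = np.sqrt(np.mean((templates_db - observed_db) ** 2, axis=1))
    # Softmax over negative distances: closer templates are more probable.
    weights = np.exp(-beta * (dist - dist.min()))
    return weights / weights.sum()

# The predicted response is then, e.g., the direction of the most probable
# template: response = directions[np.argmax(identification_probs(obs, bank))]
```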
Affiliation(s)
- Thomas Joubaud
- Acoustics and Protection of the Soldier, French-German Research Institute of Saint-Louis, 5 rue du Général Cassagnou, BP 70034, 68301 Saint-Louis, France
- Véronique Zimpfer
- Acoustics and Protection of the Soldier, French-German Research Institute of Saint-Louis, 5 rue du Général Cassagnou, BP 70034, 68301 Saint-Louis, France
- Alexandre Garcia
- Laboratoire de Mécanique des Structures et des Systèmes Couplés, Conservatoire National des Arts et Métiers, 292 rue Saint-Martin, 75141 Paris Cedex 03, France
- Christophe Langrenne
- Laboratoire de Mécanique des Structures et des Systèmes Couplés, Conservatoire National des Arts et Métiers, 292 rue Saint-Martin, 75141 Paris Cedex 03, France

22. Trapeau R, Aubrais V, Schönwiesner M. Fast and persistent adaptation to new spectral cues for sound localization suggests a many-to-one mapping mechanism. J Acoust Soc Am 2016; 140:879. [PMID: 27586720] [DOI: 10.1121/1.4960568]
Abstract
The adult human auditory system can adapt to changes in spectral cues for sound localization. This plasticity was demonstrated by changing the shape of the pinna with earmolds. Previous results indicate that participants regain localization accuracy after several weeks of adaptation and that the adapted state is retained for at least one week without earmolds. No aftereffect was observed after mold removal, but any aftereffect may be too short to be observed when responses are averaged over many trials. This work investigated the lack of aftereffect by analyzing single-trial responses and modifying visual, auditory, and tactile information during the localization task. Results showed that participants localized accurately immediately after mold removal, even at the first stimulus presentation. Knowledge of the stimulus spectrum, tactile information about the absence of the earmolds, and visual feedback were not necessary to localize accurately after adaptation. Part of the adaptation persisted for one month without molds. The results are consistent with the hypothesis of a many-to-one mapping of the spectral cues, in which several spectral profiles are simultaneously associated with one sound location. Additionally, participants with acoustically more informative spectral cues localized sounds more accurately, and larger acoustical disturbances by the molds reduced adaptation success.
Affiliation(s)
- Régis Trapeau
- International Laboratory for Brain, Music and Sound Research (BRAMS), Department of Psychology, Université de Montréal, Pavillon 1420 Boulevard Mont-Royal, Outremont, Quebec, H2V 4P3, Canada
- Valérie Aubrais
- International Laboratory for Brain, Music and Sound Research (BRAMS), Department of Psychology, Université de Montréal, Pavillon 1420 Boulevard Mont-Royal, Outremont, Quebec, H2V 4P3, Canada
- Marc Schönwiesner
- International Laboratory for Brain, Music and Sound Research (BRAMS), Department of Psychology, Université de Montréal, Pavillon 1420 Boulevard Mont-Royal, Outremont, Quebec, H2V 4P3, Canada

23. Boothalingam S, Macpherson E, Allan C, Allen P, Purcell D. Localization-in-noise and binaural medial olivocochlear functioning in children and young adults. J Acoust Soc Am 2016; 139:247-262. [PMID: 26827021] [DOI: 10.1121/1.4939708]
Abstract
Children as young as 5 years old localize sounds in quiet as accurately as adults in the frontal hemifield. However, children's ability to localize in noise and in the front/back (F/B) dimension has scarcely been studied. To address this, the first part of this study investigated the localization-in-noise ability of children vs young adults in two maskers, broadband noise (BBN) and speech-babble (SB), at three signal-to-noise ratios: -12, -6, and 0 dB. In the second part, the relationship between binaural medial olivocochlear system (MOC) function and localization-in-noise was investigated. In both studies, 21 children and 21 young adults participated. Results indicate that, while children are able to differentiate sounds arriving in the F/B dimension on par with adults in quiet and in BBN, larger differences were found for SB. The accuracy of children's localization in noise (for both maskers) in the lateral plane was also poorer than adults'. Significant differences between adults and children were also found in binaural MOC interaction (mBIC; the difference between the sum of the two monaural MOC strengths and the binaural MOC strength). For reasons which are not clear, adults' F/B localization in BBN correlates better with mBIC, while children's F/B localization in SB correlated better with binaural MOC strength.
Affiliation(s)
- Sriram Boothalingam
- National Centre for Audiology, Western University, London, Ontario N6G 1H1, Canada
- Ewan Macpherson
- National Centre for Audiology, Western University, London, Ontario N6G 1H1, Canada
- Chris Allan
- National Centre for Audiology, Western University, London, Ontario N6G 1H1, Canada
- Prudence Allen
- National Centre for Audiology, Western University, London, Ontario N6G 1H1, Canada
- David Purcell
- National Centre for Audiology, Western University, London, Ontario N6G 1H1, Canada

24. Ziegelwanger H, Majdak P, Kreuzer W. Numerical calculation of listener-specific head-related transfer functions and sound localization: Microphone model and mesh discretization. J Acoust Soc Am 2015; 138:208-222. [PMID: 26233020] [PMCID: PMC4582438] [DOI: 10.1121/1.4922518]
Abstract
Head-related transfer functions (HRTFs) can be numerically calculated by applying the boundary element method to the geometry of a listener's head and pinnae. The calculation results depend on geometrical, numerical, and acoustical parameters, such as the microphone used in acoustic measurements. The scope of this study was to estimate requirements on the size and position of the microphone model and on the discretization of the boundary geometry as a triangular polygon mesh for accurate sound localization. The evaluation involved the analysis of localization errors predicted by a sagittal-plane localization model, the comparison of equivalent head radii estimated by a time-of-arrival model, and the analysis of actual localization errors obtained in a sound-localization experiment. While the average edge length (AEL) of the mesh had a negligible effect on localization performance in the lateral dimension, the localization performance in sagittal planes degraded for larger AELs, with the geometrical error as the dominant factor. An arbitrary microphone position at the entrance of the ear canal, a microphone size of 1 mm radius, and a mesh with 1 mm AEL yielded a localization performance similar to or better than that observed with acoustically measured HRTFs.
Affiliation(s)
- Harald Ziegelwanger
- Acoustics Research Institute, Austrian Academy of Sciences, Wohllebengasse 12-14, A-1040 Vienna, Austria
- Piotr Majdak
- Acoustics Research Institute, Austrian Academy of Sciences, Wohllebengasse 12-14, A-1040 Vienna, Austria
- Wolfgang Kreuzer
- Acoustics Research Institute, Austrian Academy of Sciences, Wohllebengasse 12-14, A-1040 Vienna, Austria

25. Trapeau R, Schönwiesner M. Adaptation to shifted interaural time differences changes encoding of sound location in human auditory cortex. Neuroimage 2015; 118:26-38. [PMID: 26054873] [DOI: 10.1016/j.neuroimage.2015.06.006]
Abstract
The auditory system infers the location of sound sources from the processing of different acoustic cues. These cues change during development and when assistive hearing devices are worn. Previous studies have found behavioral recalibration to modified localization cues in human adults, but very little is known about the neural correlates and mechanisms of this plasticity. We equipped participants with digital devices, worn in the ear canal, that allowed us to delay sound input to one ear, and thus modify interaural time differences, a major cue for horizontal sound localization. Participants wore the digital earplugs continuously for nine days while engaged in day-to-day activities. Daily psychoacoustical testing showed rapid recalibration to the manipulation and confirmed that adults can adapt to shifted interaural time differences in their daily multisensory environment. High-resolution functional MRI scans performed before and after recalibration showed that recalibration was accompanied by changes in hemispheric lateralization of auditory cortex activity. These changes corresponded to a shift in spatial coding of sound direction comparable to the observed behavioral recalibration. Fitting the imaging results with a model of auditory spatial processing also revealed small shifts in voxel-wise spatial tuning within each hemisphere.
Affiliation(s)
- Régis Trapeau
- International Laboratory for Brain, Music and Sound Research (BRAMS), Department of Psychology, Université de Montréal, Montreal, QC, Canada; Centre for Research on Brain, Language and Music (CRBLM), McGill University, Montreal, QC, Canada
- Marc Schönwiesner
- International Laboratory for Brain, Music and Sound Research (BRAMS), Department of Psychology, Université de Montréal, Montreal, QC, Canada; Centre for Research on Brain, Language and Music (CRBLM), McGill University, Montreal, QC, Canada; Department of Neurology and Neurosurgery, Faculty of Medicine, McGill University, Montreal, QC, Canada

26.
Abstract
The auditory system derives locations of sound sources from spatial cues provided by the interaction of sound with the head and external ears. Those cues are analyzed in specific brainstem pathways and then integrated as cortical representation of locations. The principal cues for horizontal localization are interaural time differences (ITDs) and interaural differences in sound level (ILDs). Vertical and front/back localization rely on spectral-shape cues derived from direction-dependent filtering properties of the external ears. The likely first sites of analysis of these cues are the medial superior olive (MSO) for ITDs, lateral superior olive (LSO) for ILDs, and dorsal cochlear nucleus (DCN) for spectral-shape cues. Localization in distance is much less accurate than that in horizontal and vertical dimensions, and interpretation of the basic cues is influenced by additional factors, including acoustics of the surroundings and familiarity of source spectra and levels. Listeners are quite sensitive to sound motion, but it remains unclear whether that reflects specific motion detection mechanisms or simply detection of changes in static location. Intact auditory cortex is essential for normal sound localization. Cortical representation of sound locations is highly distributed, with no evidence for point-to-point topography. Spatial representation is strictly contralateral in laboratory animals that have been studied, whereas humans show a prominent right-hemisphere dominance.
Collapse
Affiliation(s)
- John C Middlebrooks
- Departments of Otolaryngology, Neurobiology and Behavior, Cognitive Sciences, and Biomedical Engineering, University of California at Irvine, Irvine, CA, USA.
| |
Collapse
|
27
|
Romigh GD, Simpson BD. Do you hear where I hear?: isolating the individualized sound localization cues. Front Neurosci 2014; 8:370. [PMID: 25520607 PMCID: PMC4249451 DOI: 10.3389/fnins.2014.00370] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2014] [Accepted: 10/28/2014] [Indexed: 11/13/2022] Open
Abstract
It is widely acknowledged that individualized head-related transfer function (HRTF) measurements are needed to adequately capture all of the 3D spatial hearing cues. However, many perceptual studies have shown that localization accuracy in the lateral dimension is only minimally decreased by the use of non-individualized head-related transfer functions. This evidence supports the idea that the individualized components of an HRTF could be isolated from those that are more general in nature. In the present study we decomposed the HRTF at each location into average, lateral and intraconic spectral components, along with an ITD in an effort to isolate the sound localization cues that are responsible for the inter-individual differences in localization performance. HRTFs for a given listener were then reconstructed systematically with components that were both individualized and non-individualized in nature, and the effect of each modification was analyzed via a virtual localization test where brief 250 ms noise bursts were rendered with the modified HRTFs. Results indicate that the cues important for individualization of HRTFs are contained almost exclusively in the intraconic portion of the HRTF spectra and localization is only minimally affected by introducing non-individualized cues into the other HRTF components. These results provide new insights into what specific inter-individual differences in head-related acoustical features are most relevant to sound localization, and provide a framework for how future human-machine interfaces might be more effectively generalized and/or individualized.
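As a rough sketch of the decomposition the abstract describes (under assumed array shapes and placeholder data, with the ITD component omitted), log-magnitude HRTFs can be split into average, lateral, and intraconic parts that sum back to the original:

```python
import numpy as np

# Hypothetical set of log-magnitude HRTFs indexed as
# [lateral_angle, intraconic_angle, frequency_bin]; placeholder data.
H = np.random.randn(25, 36, 256)

avg = H.mean(axis=(0, 1))                       # direction-independent average
lateral = H.mean(axis=1) - avg                  # varies only with lateral angle
intraconic = H - H.mean(axis=1, keepdims=True)  # residual within each cone
# The three components reconstruct the original HRTF set exactly:
assert np.allclose(H, avg + lateral[:, None, :] + intraconic)
```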
Collapse
|
28
|
Baumgartner R, Majdak P, Laback B. Modeling sound-source localization in sagittal planes for human listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 136:791-802. [PMID: 25096113 PMCID: PMC4582445 DOI: 10.1121/1.4887447] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
Monaural spectral features are important for human sound-source localization in sagittal planes, including front-back discrimination and elevation perception. These directional features result from the acoustic filtering of incoming sounds by the listener's morphology and are described by listener-specific head-related transfer functions (HRTFs). This article proposes a probabilistic, functional model of sagittal-plane localization that is based on human listeners' HRTFs. The model approximates spectral auditory processing, accounts for acoustic and non-acoustic listener specificity, allows for predictions beyond the median plane, and directly predicts psychoacoustic measures of localization performance. The predictive power of the listener-specific modeling approach was verified under various experimental conditions: The model predicted effects on localization performance of band limitation, spectral warping, non-individualized HRTFs, spectral resolution, spectral ripples, and high-frequency attenuation in speech. The functionalities of vital model components were evaluated and discussed in detail. Positive spectral gradient extraction, sensorimotor mapping, and binaural weighting of monaural spatial information were addressed in particular. Potential applications of the model include predictions of psychophysical effects, for instance, in the context of virtual acoustics or hearing assistive devices.
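One named model stage, positive spectral gradient extraction, is simple enough to sketch in isolation; this is not the published model code, and the comparison and sensorimotor-mapping stages are omitted:

```python
import numpy as np

def positive_spectral_gradient(excitation_db):
    """Across-frequency differences of a (dB) excitation pattern with
    negative slopes discarded; one stage of the model's spectral
    processing, shown on its own."""
    return np.maximum(np.diff(excitation_db), 0.0)
```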
Collapse
|
29
|
Thakkar T, Goupell MJ. Internalized elevation perception of simple stimuli in cochlear-implant and normal-hearing listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 136:841-852. [PMID: 25096117 PMCID: PMC4144177 DOI: 10.1121/1.4884770] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/19/2014] [Revised: 06/02/2014] [Accepted: 06/10/2014] [Indexed: 06/03/2023]
Abstract
In normal-hearing (NH) listeners, elevation perception is produced by the spectral cues imposed by the pinna, head, and torso. Elevation perception in cochlear-implant (CI) listeners appears to be non-existent; this may be a result of poorly encoded spectral cues. In this study, an analog of elevation perception was investigated by having 15 CI and 8 NH listeners report the intracranial location of spectrally simple signals (single-electrode or bandlimited acoustic stimuli, respectively) in both horizontal and vertical dimensions. Thirteen CI listeners and all of the NH listeners showed an association between place of stimulation (i.e., stimulus frequency) and perceived elevation, generally responding with higher elevations for more basal stimulation. This association persisted in the presence of a randomized temporal pitch, suggesting that listeners were not associating pitch with elevation. These data provide evidence that CI listeners might perceive changes in elevation if they were presented stimuli with sufficiently salient elevation cues.
Collapse
Affiliation(s)
- Tanvi Thakkar
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742
| | - Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742
| |
Collapse
|
30
|
Durin V, Carlile S, Guillon P, Best V, Kalluri S. Acoustic analysis of the directional information captured by five different hearing aid styles. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 136:818-828. [PMID: 25096115 DOI: 10.1121/1.4883372] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
This study compared the head-related transfer functions (HRTFs) recorded from the bare ear of a mannequin for 393 spatial locations and for five different hearing aid styles: Invisible-in-the-canal (IIC), completely-in-the-canal (CIC), in-the-canal (ITC), in-the-ear (ITE), and behind-the-ear (BTE). The spectral distortions of each style compared to the bare ear were described qualitatively in terms of the gain and frequency characteristics of the prominent spectral notch and two peaks in the HRTFs. Two quantitative measures of the differences between the HRTF sets and a measure of the dissimilarity of the HRTFs within each set were also computed. In general, the IIC style was most similar and the BTE most dissimilar to the bare ear recordings. The relative similarities among the CIC, ITC, and ITE styles depended on the metric employed. The within-style spectral dissimilarities were comparable for the bare ear, IIC, CIC, and ITC with increasing ambiguity for the ITE and BTE styles. When the analysis bandwidth was limited to 8 kHz, the HRTFs within each set became much more similar.
Collapse
Affiliation(s)
- Virginie Durin
- VAST Audio Pty Ltd., 4 Cornwallis Street, Eveleigh, New South Wales 2015, Australia
| | - Simon Carlile
- Bosch Institute and School of Medical Sciences, Anderson Stuart Building (F13), University of Sydney, New South Wales 2006, Australia
| | - Pierre Guillon
- Computing and Audio Research Laboratory, School of Electrical and Information Engineering, University of Sydney, New South Wales 2006, Australia
| | - Virginia Best
- Bosch Institute and School of Medical Sciences, Anderson Stuart Building (F13), University of Sydney, New South Wales 2006, Australia
| | - Sridhar Kalluri
- Starkey Hearing Research Center, 2150 Shattuck Avenue, Suite 408, Berkeley, California 94704-1345
| |
Collapse
|
31
|
Monson BB, Hunter EJ, Lotto AJ, Story BH. The perceptual significance of high-frequency energy in the human voice. Front Psychol 2014; 5:587. [PMID: 24982643 PMCID: PMC4059169 DOI: 10.3389/fpsyg.2014.00587] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2014] [Accepted: 05/26/2014] [Indexed: 11/25/2022] Open
Abstract
While human vocalizations generate acoustical energy at frequencies up to (and beyond) 20 kHz, the energy at frequencies above about 5 kHz has traditionally been neglected in speech perception research. The intent of this paper is to review (1) the historical reasons for this research trend and (2) the work that continues to elucidate the perceptual significance of high-frequency energy (HFE) in speech and singing. The historical and physical factors reveal that, while HFE was believed to be unnecessary and/or impractical for applications of interest, it was never shown to be perceptually insignificant. Rather, the main causes for focus on low-frequency energy appear to be because the low-frequency portion of the speech spectrum was seen to be sufficient (from a perceptual standpoint), or the difficulty of HFE research was too great to be justifiable (from a technological standpoint). The advancement of technology continues to overcome concerns stemming from the latter reason. Likewise, advances in our understanding of the perceptual effects of HFE now cast doubt on the first cause. Emerging evidence indicates that HFE plays a more significant role than previously believed, and should thus be considered in speech and voice perception research, especially in research involving children and the hearing impaired.
Collapse
Affiliation(s)
- Brian B. Monson
- Department of Pediatric Newborn Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- National Center for Voice and Speech, University of Utah, Salt Lake City, UT, USA
| | - Eric J. Hunter
- National Center for Voice and Speech, University of Utah, Salt Lake City, UT, USA
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, MI, USA
| | - Andrew J. Lotto
- Speech, Language, and Hearing Sciences, University of Arizona, Tucson, AZ, USA
| | - Brad H. Story
- Speech, Language, and Hearing Sciences, University of Arizona, Tucson, AZ, USA
| |
Collapse
|
32
|
Katz BFG, Noisternig M. A comparative study of Interaural Time Delay estimation methods. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 135:3530-3540. [PMID: 24907816 DOI: 10.1121/1.4875714] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
The Interaural Time Delay (ITD) is an important binaural cue for sound source localization. Calculations of ITD values are obtained either from measured time domain Head-Related Impulse Responses (HRIRs) or from their frequency transform Head-Related Transfer Functions (HRTFs). Numerous methods exist in current literature, based on a variety of definitions and assumptions of the nature of the ITD as an acoustic cue. This work presents a thorough comparative study of the degree of variability between some of the most common methods for calculating the ITD from measured data. Thirty-two different calculations or variations are compared for positions on the horizontal plane for the HRTF measured on both a KEMAR mannequin and a rigid sphere. Specifically, the spatial variations of the methods are investigated. Included is a discussion of the primary potential causes of these differences, such as the existence of multiple peaks in the HRIR of the contra-lateral ear for azimuths near the inter-aural axis due to multipath propagation and head/pinnae shadowing.
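For orientation, one of the simpler estimator families compared in this literature, a cross-correlation maximum over HRIR lags, can be sketched as follows; this generic version is not any particular one of the paper's thirty-two variants:

```python
import numpy as np

def itd_from_hrirs(hrir_left, hrir_right, fs):
    """Estimate the ITD (in seconds) as the lag maximizing the
    cross-correlation of the two head-related impulse responses.
    With this ordering, a positive value means the left-ear response
    arrives later than the right-ear response."""
    corr = np.correlate(hrir_left, hrir_right, mode="full")
    lag = np.argmax(np.abs(corr)) - (len(hrir_right) - 1)
    return lag / fs
```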
Collapse
Affiliation(s)
- Brian F G Katz
- Audio Acoustique, LIMSI-CNRS, Université Paris-Sud, 91403 Orsay, France
| | - Markus Noisternig
- UMR STMS IRCAM-CNRS-UPMC, 1 place Igor-Stravinsky, 75004 Paris, France
| |
Collapse
|
33
|
Scharine AA, Binseel MS, Mermagen T, Letowski TR. Sound localisation ability of soldiers wearing infantry ACH and PASGT helmets. ERGONOMICS 2014; 57:1222-1243. [PMID: 24840132 DOI: 10.1080/00140139.2014.917202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
Helmets provide soldiers with ballistic and fragmentation protection but impair auditory spatial processing. Missed auditory information can be fatal for a soldier; therefore, helmet design requires compromise between protection and optimal acoustics. Twelve soldiers localised two sound signals presented from six azimuth angles and three levels of elevation presented at two intensity levels and with three background noises. Each participant completed the task while wearing no helmet and with two U.S. Army infantry helmets - the Personnel Armor System for Ground Troops (PASGT) helmet and the Advanced Combat Helmet (ACH). Results showed a significant effect of helmet type on the size of both azimuth and elevation error. The effects of level, background noise, azimuth and elevation were found to be significant. There was no effect of sound signal type. As hypothesised, localisation accuracy was greatest when soldiers did not wear a helmet, followed by the ACH. Performance was worst with the PASGT helmet.
Collapse
|
34
|
Majdak P, Baumgartner R, Laback B. Acoustic and non-acoustic factors in modeling listener-specific performance of sagittal-plane sound localization. Front Psychol 2014; 5:319. [PMID: 24795672 PMCID: PMC4006033 DOI: 10.3389/fpsyg.2014.00319] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2013] [Accepted: 03/27/2014] [Indexed: 11/13/2022] Open
Abstract
The ability of sound-source localization in sagittal planes (along the top-down and front-back dimension) varies considerably across listeners. The directional acoustic spectral features, described by head-related transfer functions (HRTFs), also vary considerably across listeners, a consequence of the listener-specific shape of the ears. It is not clear whether the differences in localization ability result from differences in the encoding of directional information provided by the HRTFs, i.e., an acoustic factor, or from differences in auditory processing of those cues (e.g., spectral-shape sensitivity), i.e., non-acoustic factors. We addressed this issue by analyzing the listener-specific localization ability in terms of localization performance. Directional responses to spatially distributed broadband stimuli from 18 listeners were used. A model of sagittal-plane localization was fit individually for each listener by considering the actual localization performance, the listener-specific HRTFs representing the acoustic factor, and an uncertainty parameter representing the non-acoustic factors. The model was configured to simulate the condition of complete calibration of the listener to the tested HRTFs. Listener-specifically calibrated model predictions yielded correlations of, on average, 0.93 with the actual localization performance. Then, the model parameters representing the acoustic and non-acoustic factors were systematically permuted across the listener group. While the permutation of HRTFs affected the localization performance, the permutation of listener-specific uncertainty had a substantially larger impact. Our findings suggest that across-listener variability in sagittal-plane localization ability is only marginally determined by the acoustic factor, i.e., the quality of directional cues found in typical human HRTFs. Rather, the non-acoustic factors, supposed to represent the listeners' efficiency in processing directional cues, appear to be important.
Collapse
Affiliation(s)
- Piotr Majdak
- Psychoacoustics and Experimental Audiology, Acoustics Research Institute, Austrian Academy of Sciences, Wien, Austria
| | - Robert Baumgartner
- Psychoacoustics and Experimental Audiology, Acoustics Research Institute, Austrian Academy of Sciences, Wien, Austria
| | - Bernhard Laback
- Psychoacoustics and Experimental Audiology, Acoustics Research Institute, Austrian Academy of Sciences, Wien, Austria
| |
Collapse
|
35
|
Reijniers J, Vanderelst D, Jin C, Carlile S, Peremans H. An ideal-observer model of human sound localization. BIOLOGICAL CYBERNETICS 2014; 108:169-181. [PMID: 24570350 DOI: 10.1007/s00422-014-0588-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/10/2013] [Accepted: 01/24/2014] [Indexed: 06/03/2023]
Abstract
In recent years, a great deal of research within the field of sound localization has been aimed at finding the acoustic cues that human listeners use to localize sounds and understanding the mechanisms by which they process these cues. In this paper, we propose a complementary approach by constructing an ideal-observer model, by which we mean a model that performs optimal information processing within a Bayesian context. The model considers all available spatial information contained within the acoustic signals encoded by each ear. Parameters for the optimal Bayesian model are determined based on psychoacoustic discrimination experiments on interaural time difference and sound intensity. Without regard as to how the human auditory system actually processes information, we examine the best possible localization performance that could be achieved based only on analysis of the input information, given the constraints of the normal auditory system. We show that the model performance is generally in good agreement with the actual human localization performance, as assessed in a meta-analysis of many localization experiments (Best et al. in Principles and applications of spatial hearing, pp 14-23. World Scientific Publishing, Singapore, 2011). We believe this approach can shed new light on the optimality (or otherwise) of human sound localization, especially with regard to the level of uncertainty in the input information. Moreover, the proposed model allows one to study the relative importance of various (combinations of) acoustic cues for spatial localization and enables a prediction of which cues are most informative and therefore likely to be used by humans in various circumstances.
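In generic form (not the paper's specific likelihood), the ideal observer evaluates a posterior over source directions given the binaural input and reports its maximum:

```latex
\[
p(\theta \mid \mathbf{x}) =
  \frac{p(\mathbf{x} \mid \theta)\, p(\theta)}
       {\int p(\mathbf{x} \mid \theta')\, p(\theta')\, \mathrm{d}\theta'},
\qquad
\hat{\theta} = \arg\max_{\theta}\; p(\theta \mid \mathbf{x}),
\]
```

where the vector of noisy interaural and spectral cues plays the role of the input, the likelihood's variances are fixed by the ITD and intensity discrimination data, and the prior covers the possible source directions.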
Collapse
Affiliation(s)
- J Reijniers
- Biology Department, University of Antwerp, Antwerp, Belgium.
| | | | | | | | | |
Collapse
|
36
|
Macpherson EA, Sabin AT. Vertical-plane sound localization with distorted spectral cues. Hear Res 2013; 306:76-92. [PMID: 24076423 DOI: 10.1016/j.heares.2013.09.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/16/2013] [Revised: 09/11/2013] [Accepted: 09/17/2013] [Indexed: 10/26/2022]
Abstract
For human listeners, the primary cues for localization in the vertical plane are provided by the direction-dependent filtering of the pinnae, head, and upper body. Vertical-plane localization generally is accurate for broadband sounds, but when such sounds are presented at near-threshold levels or at high levels with short durations (<20 ms), the apparent location is biased toward the horizontal plane (i.e., elevation gain <1). We tested the hypothesis that these effects result in part from distorted peripheral representations of sound spectra. Human listeners indicated the apparent position of 100-ms, 50-60 dB SPL, wideband noise-burst targets by orienting their heads. The targets were synthesized in virtual auditory space and presented over headphones. Faithfully synthesized targets were interleaved with targets for which the directional transfer function spectral notches were filled in, peaks were leveled off, or the spectral contrast of the entire profile was reduced or expanded. As notches were filled in progressively or peaks leveled progressively, elevation gain decreased in a graded manner similar to that observed as sensation level is reduced below 30 dB or, for brief sounds, increased above 45 dB. As spectral contrast was reduced, gain dropped only at the most extreme reduction (25% of normal). Spectral contrast expansion had little effect. The results are consistent with the hypothesis that loss of representation of spectral features contributes to reduced elevation gain at low and high sound levels. The results also suggest that perceived location depends on a correlation-like spectral matching process that is sensitive to the relative, rather than absolute, across-frequency shape of the spectral profile.
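The spectral-contrast manipulation can be sketched as scaling a directional transfer function's deviations around its across-frequency mean; this formulation is assumed from the abstract, not taken from the authors' code:

```python
import numpy as np

def scale_spectral_contrast(dtf_db, factor):
    """Scale the across-frequency contrast of a directional transfer
    function given in dB: factor < 1 flattens peaks and notches,
    factor > 1 expands them."""
    mean_db = dtf_db.mean()
    return mean_db + factor * (dtf_db - mean_db)

# e.g., the most extreme reduction reported: 25% of normal contrast
# flattened = scale_spectral_contrast(dtf_db, 0.25)
```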
Collapse
Affiliation(s)
- Ewan A Macpherson
- Kresge Hearing Research Institute, University of Michigan Medical School, 1150 W. Medical Center Drive, Ann Arbor, MI 48109-5616, USA; National Centre for Audiology, Western University, 1201 Western Road, London, Ontario, Canada N6G 1H1.
| | | |
Collapse
|
37
|
Majdak P, Walder T, Laback B. Effect of long-term training on sound localization performance with spectrally warped and band-limited head-related transfer functions. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 134:2148-2159. [PMID: 23967945 DOI: 10.1121/1.4816543] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
Sound localization in the sagittal planes, including the ability to distinguish front from back, relies on spectral features caused by the filtering effects of the head, pinna, and torso. It is assumed that important spatial cues are encoded in the frequency range between 4 and 16 kHz. In this study, in a double-blind design and using audio-visual training covering the full 3-D space, normal-hearing listeners were trained 2 h per day over three weeks to localize sounds which were either band limited up to 8.5 kHz or spectrally warped from the range between 2.8 and 16 kHz to the range between 2.8 and 8.5 kHz. The training effect for the warped condition exceeded that for procedural task learning, suggesting a stable auditory recalibration due to the training. After the training, performance with band-limited sounds was better than that with warped ones. The results show that training can improve sound localization in cases where spectral cues have been reduced by band-limiting or remapped by warping. This suggests that hearing-impaired listeners, who have limited access to high frequencies, might also improve their localization ability when provided with spectrally warped or band-limited sounds and adequately trained on sound localization.
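A sketch of the kind of spectral warping described, linearly remapping the 2.8-16 kHz region onto 2.8-8.5 kHz by resampling the magnitude spectrum; the study's exact warping function and phase handling are not reproduced here:

```python
import numpy as np

def warp_spectrum(mag, freqs, src=(2800.0, 16000.0), dst=(2800.0, 8500.0)):
    """Compress the part of a magnitude spectrum between the src
    frequencies into the dst range by resampling along a linearly
    warped frequency axis, then band-limit above it."""
    warped = mag.copy()
    in_dst = (freqs >= dst[0]) & (freqs <= dst[1])
    # Map each destination frequency back to its source frequency.
    f_src = src[0] + (freqs[in_dst] - dst[0]) * (src[1] - src[0]) / (dst[1] - dst[0])
    warped[in_dst] = np.interp(f_src, freqs, mag)
    warped[freqs > dst[1]] = 0.0
    return warped

fs = 44100
freqs = np.fft.rfftfreq(2048, 1 / fs)
mag = np.ones_like(freqs)            # flat placeholder spectrum
warped = warp_spectrum(mag, freqs)
```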
Collapse
Affiliation(s)
- Piotr Majdak
- Acoustics Research Institute, Austrian Academy of Sciences, Wohllebengasse 12-14, A-1040 Vienna, Austria.
| | | | | |
Collapse
|
38
|
Honda A, Shibata H, Hidaka S, Gyoba J, Iwaya Y, Suzuki Y. Effects of head movement and proprioceptive feedback in training of sound localization. Iperception 2013; 4:253-64. [PMID: 24349686 PMCID: PMC3859569 DOI: 10.1068/i0522] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2012] [Revised: 04/19/2013] [Indexed: 11/29/2022] Open
Abstract
We investigated the effects of listeners' head movements and proprioceptive feedback during sound localization practice on the subsequent accuracy of sound localization performance. The effects were examined under both restricted and unrestricted head movement conditions in the practice stage. In both cases, the participants were divided into two groups: a feedback group performed a sound localization drill with accurate proprioceptive feedback; a control group conducted it without the feedback. Results showed that (1) sound localization practice, while allowing for free head movement, led to improvement in sound localization performance and decreased actual angular errors along the horizontal plane, and that (2) proprioceptive feedback during practice decreased actual angular errors in the vertical plane. Our findings suggest that unrestricted head movement and proprioceptive feedback during sound localization training enhance perceptual motor learning by enabling listeners to use variable auditory cues and proprioceptive information.
Collapse
Affiliation(s)
- Akio Honda
- Research Institute of Electrical Communication, Tohoku University, 2-1-1, Katahira, Aoba-ku, Sendai, Miyagi 980-8577, Japan. Currently at: Department of Welfare Psychology, Tohoku Fukushi University, 1-8-1, Kunimi, Aoba-ku, Sendai, Miyagi 981-8522, Japan; e-mail:
| | - Hiroshi Shibata
- Research Institute of Electrical Communication, Tohoku University, 2-1-1, Katahira, Aoba-ku, Sendai, Miyagi 980-8577, Japan; Department of Psychology, Graduate School of Arts and Letters, Tohoku University, 27-1, Kawauchi, Aoba-ku, Sendai, Miyagi 980-8576, Japan. Currently at: Faculty of Medical Science and Welfare, Tohoku Bunka Gakuen University, 6-45-1, Kunimi, Aoba-ku, Sendai, Miyagi 981-0943, Japan; e-mail:
| | - Souta Hidaka
- Department of Psychology, Rikkyo University, 1-2-26, Kitano, Niiza-shi, Saitama 352-8558, Japan; e-mail:
| | - Jiro Gyoba
- Department of Psychology, Graduate School of Arts and Letters, Tohoku University, 27-1, Kawauchi, Aoba-ku, Sendai, Miyagi 980-8576, Japan; e-mail:
| | - Yukio Iwaya
- Research Institute of Electrical Communication, Tohoku University, 2-1-1, Katahira, Aoba-ku, Sendai, Miyagi 980-8577, Japan. Currently at: Faculty of Engineering, Tohoku Gakuin University, 1-13-1, Chuo, Tagajo, Miyagi 985-8537, Japan; e-mail:
| | - Yôiti Suzuki
- Research Institute of Electrical Communication, Tohoku University, 2-1-1, Katahira, Aoba-ku, Sendai, Miyagi 980-8577, Japan; e-mail:
| |
Collapse
|
39
|
Majdak P, Masiero B, Fels J. Sound localization in individualized and non-individualized crosstalk cancellation systems. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 133:2055-2068. [PMID: 23556576 DOI: 10.1121/1.4792355] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
The sound-source localization provided by a crosstalk cancellation (CTC) system depends on the head-related transfer functions (HRTFs) used for the CTC filter calculation. In this study, the horizontal- and sagittal-plane localization performance was investigated in humans listening to individualized matched, individualized but mismatched, and non-individualized CTC systems. The systems were simulated via headphones in a binaural virtual environment with two virtual loudspeakers spatialized in front of the listener. The individualized mismatched system was based on two different sets of listener-individual HRTFs. Both sets provided similar binaural localization performance in terms of quadrant, polar, and lateral errors. The individualized matched systems provided performance similar to that from the binaural listening. For the individualized mismatched systems, the performance deteriorated, and for the non-individualized mismatched systems (based on HRTFs from other listeners), the performance deteriorated even more. The direction-dependent analysis showed that mismatch and lack of individualization yielded a substantially degraded performance for targets placed outside of the loudspeaker span and behind the listeners, showing relevance of individualized CTC systems for those targets. Further, channel separation was calculated for different frequency ranges and is discussed in the light of its use as a predictor for the localization performance provided by a CTC system.
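For background, the filter calculation can be sketched as a textbook regularized inversion of the 2x2 loudspeaker-to-ear transfer matrix; the placeholder HRTFs and regularization constant are assumptions, and the study's own filter design may add further constraints:

```python
import numpy as np

def ctc_filters(H, beta=1e-3):
    """Per-frequency crosstalk-cancellation filters via a regularized
    least-squares inverse of the loudspeaker-to-ear transfer matrix.
    H has shape (n_bins, 2 ears, 2 loudspeakers)."""
    HH = np.conj(np.swapaxes(H, 1, 2))           # Hermitian transpose per bin
    return np.linalg.solve(HH @ H + beta * np.eye(2), HH)

H = np.random.randn(257, 2, 2) + 1j * np.random.randn(257, 2, 2)  # placeholder HRTFs
C = ctc_filters(H)   # for well-conditioned H, the chain H @ C approximates identity
```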
Collapse
Affiliation(s)
- Piotr Majdak
- Acoustics Research Institute, Austrian Academy of Sciences, A-1040 Vienna, Austria.
| | | | | |
Collapse
|
40
|
Allen K, Alais D, Carlile S. A Collection of Pseudo-Words to Study Multi-Talker Speech Intelligibility without Shifts of Spatial Attention. Front Psychol 2012; 3:49. [PMID: 22435061 PMCID: PMC3304086 DOI: 10.3389/fpsyg.2012.00049] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2011] [Accepted: 02/08/2012] [Indexed: 11/13/2022] Open
Abstract
A new collection of pseudo-words was recorded from a single female speaker of American English for use in multi-talker speech intelligibility research. The pseudo-words (known as the KARG collection) consist of three groups of single syllable pseudo-words varying only by the initial phoneme. The KARG method allows speech intelligibility to be studied free of the influence of shifts of spatial attention from one loudspeaker location to another in multi-talker contexts. To achieve this, all KARG pseudo-words share the same concluding rimes, with only the first phoneme serving as a distinguishing identifier. This ensures that listeners are unable to correctly identify the target pseudo-word without hearing the initial phoneme. As the durations of all the initial phonemes are brief, much shorter than the time required to spatially shift attention, the KARG method assesses speech intelligibility without the confound of shifting spatial attention. The KARG collection is available free for research purposes.
Collapse
Affiliation(s)
- Kachina Allen
- Auditory, Brain and Cognitive Development Laboratory, McGill University, Montreal, QC, Canada
| | | | | |
Collapse
|
41
|
Abstract
The auditory system represents sound-source directions initially in head-centered coordinates. To program eye-head gaze shifts to sounds, the orientation of eyes and head should be incorporated to specify the target relative to the eyes. Here we test (1) whether this transformation involves a stage in which sounds are represented in a world- or a head-centered reference frame, and (2) whether acoustic spatial updating occurs at a topographically organized motor level representing gaze shifts, or within the tonotopically organized auditory system. Human listeners generated head-unrestrained gaze shifts from a large range of initial eye and head positions toward brief broadband sound bursts, and to tones at different center frequencies, presented in the midsagittal plane. Tones were heard at a fixed illusory elevation, regardless of their actual location, that depended in an idiosyncratic way on initial head and eye position, as well as on the tone's frequency. Gaze shifts to broadband sounds were accurate, fully incorporating initial eye and head positions. The results support the hypothesis that the auditory system represents sounds in a supramodal reference frame, and that signals about eye and head orientation are incorporated at a tonotopic stage.
Collapse
|
42
|
Kulkarni PN, Pandey PC, Jangamashetti DS. Binaural dichotic presentation to reduce the effects of spectral masking in moderate bilateral sensorineural hearing loss. Int J Audiol 2011; 51:334-44. [PMID: 22201526 DOI: 10.3109/14992027.2011.642012] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
OBJECTIVE The objective of the study was to evaluate the effectiveness of binaural dichotic presentation using comb filters with complementary magnitude responses, based on fixed bandwidth and auditory critical bandwidth, in improving speech perception by persons with moderate bilateral sensorineural hearing loss, and to assess its effect on localization of the sound source. DESIGN AND STUDY SAMPLE Listening tests involving consonant recognition and source direction identification were conducted on six normal-hearing subjects under simulated hearing loss and on eleven subjects with moderate bilateral sensorineural loss in quiet. RESULTS The tests on normal-hearing subjects showed higher recognition scores and smaller response times for the comb filters based on the auditory critical bandwidth. The tests using these comb filters on the hearing-impaired subjects resulted in an increase of 14%-31% (mean: 22%) in recognition scores and a significant decrease in response times, with no significant effect on the identification of the direction of broadband sound sources. CONCLUSIONS The results show that dichotic presentation may be useful for speech processing in binaural hearing aids.
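A minimal sketch of complementary comb filtering for dichotic presentation, assigning alternating frequency bands to the two ears; the band edges here are illustrative, standing in for the study's fixed-bandwidth and critical-bandwidth designs:

```python
import numpy as np

def complementary_band_masks(n_fft, fs, edges_hz):
    """Complementary magnitude masks assigning alternating frequency
    bands (split at edges_hz) to the two ears; applying them to a
    signal's spectrum yields the dichotic pair."""
    freqs = np.fft.rfftfreq(n_fft, 1 / fs)
    band = np.searchsorted(edges_hz, freqs)   # band index of each bin
    left = (band % 2 == 0).astype(float)      # even-numbered bands to left ear
    return left, 1.0 - left                   # odd-numbered bands to right ear

left_mask, right_mask = complementary_band_masks(
    1024, 10000, edges_hz=[500, 1000, 2000, 4000])  # illustrative edges
```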
Collapse
Affiliation(s)
- Pandurangarao N Kulkarni
- Department of Electrical Engineering, Indian Institute of Technology Bombay, Powai, Maharashtra, India
| | | | | |
Collapse
|
43
|
Schreuder M, Rost T, Tangermann M. Listen, You are Writing! Speeding up Online Spelling with a Dynamic Auditory BCI. Front Neurosci 2011; 5:112. [PMID: 22016719 PMCID: PMC3192990 DOI: 10.3389/fnins.2011.00112] [Citation(s) in RCA: 95] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2011] [Accepted: 09/01/2011] [Indexed: 12/14/2022] Open
Abstract
Representing an intuitive spelling interface for brain-computer interfaces (BCI) in the auditory domain is not straightforward. In consequence, all existing approaches based on event-related potentials (ERP) rely at least partially on a visual representation of the interface. This online study introduces an auditory spelling interface that eliminates the necessity for such a visualization. In up to two sessions, a group of healthy subjects (N = 21) was asked to use a text entry application, utilizing the spatial cues of the AMUSE paradigm (Auditory Multi-class Spatial ERP). The speller relies on the auditory sense both for stimulation and the core feedback. Without prior BCI experience, 76% of the participants were able to write a full sentence during the first session. By exploiting the advantages of a newly introduced dynamic stopping method, a maximum writing speed of 1.41 char/min (7.55 bits/min) could be reached during the second session (average: 0.94 char/min, 5.26 bits/min). For the first time, the presented work shows that an auditory BCI can reach performances similar to state-of-the-art visual BCIs based on covert attention. These results represent an important step toward a purely auditory BCI.
Collapse
Affiliation(s)
- Martijn Schreuder
- Machine Learning Laboratory, Berlin Institute of Technology, Berlin, Germany
| | | | | |
Collapse
|
44
|
Allen K, Alais D, Shinn-Cunningham B, Carlile S. Masker location uncertainty reveals evidence for suppression of maskers in two-talker contexts. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2011; 130:2043-2053. [PMID: 21973359 PMCID: PMC3206908 DOI: 10.1121/1.3631666] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/11/2010] [Revised: 08/05/2011] [Accepted: 08/08/2011] [Indexed: 05/31/2023]
Abstract
In many natural settings, spatial release from masking aids speech intelligibility, especially when there are competing talkers. This paper describes a series of three experiments that investigate the role of prior knowledge of masker location on phoneme identification and spatial release from masking. In contrast to previous work, these experiments use initial stop-consonant identification as a test of target intelligibility to ensure that listeners had little time to switch the focus of spatial attention during the task. The first experiment shows that target phoneme identification was worse when a masker played from an unexpected location (increasing the consonant identification threshold by 2.6 dB) compared to when an energetically very similar and symmetrically located masker came from an expected location. In the second and third experiments, target phoneme identification was worse (increasing target threshold levels by 2.0 and 2.6 dB, respectively) when the target was played unexpectedly on the side from which the masker was expected compared to when the target came from an unexpected, symmetrical location in the hemifield opposite the expected location of the masker. These results support the idea that listeners modulate spatial attention by both focusing resources on the expected target location and withdrawing attentional resources from expected locations of interfering sources.
Collapse
Affiliation(s)
- Kachina Allen
- School of Medical Sciences, University of Sydney, New South Wales 2006, Australia.
| | | | | | | |
Collapse
|
45
|
Majdak P, Goupell MJ, Laback B. Two-dimensional localization of virtual sound sources in cochlear-implant listeners. Ear Hear 2011; 32:198-208.
Abstract
OBJECTIVE To test localization of sound sources in horizontal and vertical dimensions in cochlear-implant (CI) listeners using clinical bilateral CI systems. DESIGN Five bilateral CI subjects listened via their clinical speech processors to noises filtered with subject-specific, behind-the-ear microphones and head-related transfer functions. Subjects were immersed in a visual virtual environment presented via a head-mounted display. Subjects used a manual pointer to respond to the perceived sound location and received visual response feedback via the head-mounted display during the tests. The target positions were randomly distributed in two-dimensional space over an azimuth range of 0° to 360° and over an elevation range of -30° to +80°. In experiment 1, the signal level was roved in the range of ±2.5 dB from trial to trial. In experiment 2, the signal level was roved in the range of ±5 dB. RESULTS CI subjects were generally worse at sound localization than normal-hearing listeners tested in a previous study, in both the horizontal and vertical dimensions. In the horizontal plane, subjects could determine the correct side and locate the target within the side at better than chance performance. In the vertical plane, with a smaller level-roving range, subjects could determine the correct hemifield at better than chance performance but could not locate the target within the correct hemifield. The target angle and response angle were correlated as expected. The response angle and signal level range were also correlated, raising concerns that subjects were using only level cues for the task. With a larger level-roving range, the number of front-back confusions increased. The correlation between the target and response angles decreased, whereas the correlation between the level and response angle did not change, which is an indication that the subjects were relying heavily on level cues. CONCLUSIONS For the horizontal plane, the results are in agreement with previous CI studies performed in the horizontal plane with a comparable range of targets. For the vertical plane, CI listeners could discriminate front from back at better than chance performance; however, there are strong indications that the broadband level, not the spectral profile, was used as the primary localization cue. This study indicates the necessity of new CI processing strategies that encode spectral localization cues.
Collapse
|
46
|
Dobreva MS, O'Neill WE, Paige GD. Influence of aging on human sound localization. J Neurophysiol 2011; 105:2471-86. [PMID: 21368004 DOI: 10.1152/jn.00951.2010] [Citation(s) in RCA: 83] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Errors in sound localization, associated with age-related changes in peripheral and central auditory function, can pose threats to self and others in a commonly encountered environment such as a busy traffic intersection. This study aimed to quantify the accuracy and precision (repeatability) of free-field human sound localization as a function of advancing age. Head-fixed young, middle-aged, and elderly listeners localized band-passed targets using visually guided manual laser pointing in a darkened room. Targets were presented in the frontal field by a robotically controlled loudspeaker assembly hidden behind a screen. Broadband targets (0.1-20 kHz) activated all auditory spatial channels, whereas low-pass and high-pass targets selectively isolated interaural time and intensity difference cues (ITDs and IIDs) for azimuth and high-frequency spectral cues for elevation. In addition, to assess the upper frequency limit of ITD utilization across age groups more thoroughly, narrowband targets were presented at 250-Hz intervals from 250 Hz up to ∼ 2 kHz. Young subjects generally showed horizontal overestimation (overshoot) and vertical underestimation (undershoot) of auditory target location, and this effect varied with frequency band. Accuracy and/or precision worsened in older individuals for broadband, high-pass, and low-pass targets, reflective of peripheral but also central auditory aging. In addition, compared with young adults, middle-aged, and elderly listeners showed pronounced horizontal localization deficiencies (imprecision) for narrowband targets within 1,250-1,575 Hz, congruent with age-related central decline in auditory temporal processing. Findings underscore the distinct neural processing of the auditory spatial cues in sound localization and their selective deterioration with advancing age.
Collapse
Affiliation(s)
- Marina S Dobreva
- Department of Neurobiology and Anatomy, University of Rochester School of Medicine and Dentistry, 601 Elmwood Ave., Rochester, NY 14642-8603, USA
| | | | | |
Collapse
|
47
|
Best V, Kalluri S, McLachlan S, Valentine S, Edwards B, Carlile S. A comparison of CIC and BTE hearing aids for three-dimensional localization of speech. Int J Audiol 2011; 49:723-32. [PMID: 20515424 DOI: 10.3109/14992027.2010.484827] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
Three-dimensional sound localization of speech in anechoic space was examined for eleven listeners with sensorineural hearing loss. The listeners were fitted bilaterally with CIC and BTE hearing aids having similar bandwidth capabilities. The goal was to determine whether differences in microphone placement for these two styles (CICs at the ear canal entrance; BTEs above the pinna) would influence the availability of pinna-related spectral cues and hence localization performance. While lateral and polar angle localization was unaffected by the hearing aid style, the rate of front-back reversals was lower with CICs. This pattern persisted after listeners accommodated to each set of aids for a six week period, although the overall rate of reversals declined. Performance on all measures in all conditions was considerably poorer than in a control group of listeners with normal hearing.
Collapse
Affiliation(s)
- Virginia Best
- School of Medical Sciences, University of Sydney, Australia
| | | | | | | | | | | |
Collapse
|
48
|
Van den Bogaert T, Carette E, Wouters J. Sound source localization using hearing aids with microphones placed behind-the-ear, in-the-canal, and in-the-pinna. Int J Audiol 2011; 50:164-76. [PMID: 21208034 DOI: 10.3109/14992027.2010.537376] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
OBJECTIVE The effect of different commercial hearing aids on the ability to resolve front-back confusions and on sound localization in the frontal horizontal and vertical plane was studied. DESIGN Commercial hearing aids with a microphone placed in-the-ear-canal (ITC), behind-the-ear (BTE), and in-the-pinna (ITP) were evaluated in the frontal and full horizontal plane, and in the frontal vertical plane. STUDY SAMPLE A group of 13 hearing-impaired subjects evaluated the hearing aids. Nine normal-hearing listeners were used as a reference group. RESULTS AND CONCLUSIONS Differences in sound localization in the front-back dimension were found for different hearing aids. A large inter-subject variability was found during the front-back and elevation experiments. With ITP or ITC microphones, almost all natural spectral information was preserved. One of the BTE hearing aids, which is equipped with a directional microphone configuration, generated a sufficient amount of spectral cues to allow front-back discrimination. No significant effect of hearing aids on elevation performance in the frontal vertical plane was observed. Hearing-impaired subjects reached the same performance with and without the different hearing aids. In the unaided condition, a frequency-specific audibility correction was applied. Some of the hearing-impaired listeners reached normal hearing performance with this correction.
Collapse
|
49
|
So RHY, Ngan B, Horner A, Braasch J, Blauert J, Leung KL. Toward orthogonal non-individualised head-related transfer functions for forward and backward directional sound: cluster analysis and an experimental study. ERGONOMICS 2010; 53:767-781. [PMID: 20496243 DOI: 10.1080/00140131003675117] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]
Abstract
Individualised head-related transfer functions (HRTFs) have been shown to accurately simulate forward and backward directional sounds. This study explores directional simulation for non-individualised HRTFs by determining orthogonal HRTFs for listeners to choose between. Using spectral features previously shown to aid forward-backward differentiation, 196 non-individualised HRTFs were clustered into six orthogonal groups and the centre HRTF of each group was selected as representative. An experiment with 15 listeners was conducted to evaluate the benefits of choosing between six centre-front and six centre-back directional sounds rather than the single front/back sounds produced by MIT-KEMAR HRTFs. Sound localisation error was significantly reduced by 22%, and 65% of listeners reduced their front-back confusion rates. The significant reduction was maintained when the number of HRTFs was reduced from six to five. This represents a preliminary success in bridging the gap between individual and non-individual HRTFs for applications such as spatial surround sound systems. STATEMENT OF RELEVANCE: Due to different pinna shapes, directional sound stimuli generated by non-individualised HRTFs suffer from serious front-back confusion. The reported work demonstrates a way to reduce front-back confusion for centre-back sounds generated from non-individualised HRTFs.
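For illustration, grouping HRTF-derived spectral features into six clusters and taking the member nearest each centre as representative might be sketched as below; the feature matrix is a placeholder, and k-means is an assumed stand-in for the study's cluster analysis:

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder matrix: one row per candidate HRTF set, columns holding
# spectral features relevant to front-back differentiation.
features = np.random.randn(196, 12)

kmeans = KMeans(n_clusters=6, n_init=10, random_state=0).fit(features)
# Use the HRTF closest to each cluster centre as that group's representative.
rep_idx = [int(np.argmin(np.linalg.norm(features - c, axis=1)))
           for c in kmeans.cluster_centers_]
```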
Collapse
Affiliation(s)
- R H Y So
- Department of Industrial Engineering and Logistics Management, Hong Kong University of Science and Technology, Hong Kong, PRC.
| | | | | | | | | | | |
Collapse
|
50
|
Majdak P, Goupell MJ, Laback B. 3-D localization of virtual sound sources: effects of visual environment, pointing method, and training. Atten Percept Psychophys 2010; 72:454-69. [PMID: 20139459 DOI: 10.3758/app.72.2.454] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
The ability to localize sound sources in three-dimensional space was tested in humans. In Experiment 1, naive subjects listened to noises filtered with subject-specific head-related transfer functions. The tested conditions included the pointing method (head or manual pointing) and the visual environment (VE; darkness or virtual VE). The localization performance was not significantly different between the pointing methods. The virtual VE significantly improved the horizontal precision and reduced the number of front-back confusions. These results show the benefit of using a virtual VE in sound localization tasks. In Experiment 2, subjects were provided with sound localization training. Over the course of training, the performance improved for all subjects, with the largest improvements occurring during the first 400 trials. The improvements beyond the first 400 trials were smaller. After the training, there was still no significant effect of pointing method, showing that the choice of either head- or manual-pointing method plays a minor role in sound localization performance. The results of Experiment 2 reinforce the importance of perceptual training for at least 400 trials in sound localization studies.
Collapse
|