1
Lladó P, Hyvärinen P, Pulkki V. The impact of head-worn devices in an auditory-aided visual search task. J Acoust Soc Am 2024; 155:2460-2469. PMID: 38578178. DOI: 10.1121/10.0025542.
Abstract
Head-worn devices (HWDs) interfere with the natural transmission of sound from the source to the ears of the listener, worsening their localization abilities. The localization errors introduced by HWDs have been mostly studied in static scenarios, but these errors are reduced if head movements are allowed. We studied the effect of 12 HWDs on an auditory-cued visual search task, where head movements were not restricted. In this task, a visual target had to be identified in a three-dimensional space with the help of an acoustic stimulus emitted from the same location as the visual target. The results showed an increase in the search time caused by the HWDs. Acoustic measurements of a dummy head wearing the studied HWDs showed evidence of impaired localization cues, which were used to estimate the perceived localization errors using computational auditory models of static localization. These models were able to explain the search-time differences in the perceptual task, showing the influence of quadrant errors in the auditory-aided visual search task. These results indicate that HWDs have an impact on sound-source localization even when head movements are possible, which may compromise the safety and the quality of experience of the wearer.
Affiliation(s)
- Pedro Lladó: Acoustics Lab, Department of Information and Communication Engineering, Aalto University, Espoo, 00076, Finland
- Petteri Hyvärinen: Acoustics Lab, Department of Information and Communication Engineering, Aalto University, Espoo, 00076, Finland
- Ville Pulkki: Acoustics Lab, Department of Information and Communication Engineering, Aalto University, Espoo, 00076, Finland
2
Yost WA. Randomizing spectral cues used to resolve front-back reversals in sound-source localization. J Acoust Soc Am 2023; 154:661-670. PMID: 37540095; PMCID: PMC10404140. DOI: 10.1121/10.0020563.
Abstract
Front-back reversals (FBRs) in sound-source localization tasks, due to cone-of-confusion errors on the azimuth plane, occur with some regularity, and their occurrence is listener-dependent. There are fewer FBRs for wideband, high-frequency sounds than for low-frequency sounds, presumably because the sources of low-frequency sounds are localized on the basis of interaural differences (interaural time and level differences), which can lead to ambiguous responses. Spectral cues can aid in determining sound-source locations for wideband, high-frequency sounds, and such spectral cues do not lead to ambiguous responses. However, the extent to which spectral features aid sound-source localization is still not known. This paper explores conditions in which the spectral profile of two-octave-wide noise bands, whose sources were localized on the azimuth plane, was randomly varied. The experiment demonstrated that such spectral profile randomization increased FBRs for high-frequency noise bands, presumably because whatever spectral features are used for sound-source localization were no longer as useful for resolving FBRs, so listeners relied on interaural differences, which led to response ambiguities. Additionally, head rotation decreased FBRs in all cases, even when FBRs had increased due to spectral profile randomization. In all cases, the occurrence of FBRs was listener-dependent.
Affiliation(s)
- William A Yost: Spatial Hearing Lab, College of Health Solutions, Arizona State University, Tempe, Arizona 85004, USA
3
McLachlan G, Majdak P, Reijniers J, Mihocic M, Peremans H. Dynamic spectral cues do not affect human sound localization during small head movements. Front Neurosci 2023; 17:1027827. PMID: 36816108; PMCID: PMC9936143. DOI: 10.3389/fnins.2023.1027827.
Abstract
Natural listening involves the constant deployment of small head movements. Head movements facilitate spatial listening, especially when resolving front-back confusions, an otherwise common issue during sound localization under head-still conditions. The present study investigated which acoustic cues human listeners use to localize sounds with small head movements (below ±10° around the center). Seven normal-hearing subjects participated in a sound localization experiment in a virtual reality environment. Four acoustic cue stimulus conditions were presented (full spectrum, flattened spectrum, frozen spectrum, free field) under three movement conditions (no movement, head rotation about the yaw axis, and head rotation about the pitch axis). Localization performance was assessed using three metrics: lateral precision error, polar precision error, and front-back confusion rate. Analysis with mixed-effects models showed that even small yaw rotations provide a remarkable decrease in the front-back confusion rate, whereas pitch rotations had little effect. Furthermore, monaural spectral shape (MSS) cues improved localization performance even in the presence of dynamic interaural time difference (dITD) cues. However, performance was similar between stimuli with and without dynamic monaural spectral shape (dMSS) cues. This indicates that human listeners utilize MSS cues before the head moves, but do not rely on dMSS cues to localize sounds when making small head movements.
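The front-back disambiguation by yaw rotation described above follows from how the interaural time difference (ITD) changes with head orientation: a front source and its rear mirror image share the same static ITD, but a small yaw shifts their ITDs in opposite directions. A minimal sketch under a simple spherical-head model (function name, head radius, and angles are illustrative, not values from the paper):

```python
import math

def itd_us(azimuth_deg, head_radius_m=0.0875, c=343.0):
    """Spherical-head ITD in microseconds, low-frequency limit:
    ITD = (2a/c) * sin(azimuth). It depends only on the lateral angle,
    so a source and its front-back mirror give the same static ITD."""
    return 2 * head_radius_m / c * math.sin(math.radians(azimuth_deg)) * 1e6

front, back = 30.0, 150.0   # a front-back mirror pair (same lateral angle)
yaw = 5.0                   # small head rotation to the right, in degrees

# Static ITDs are identical -> the two positions are ambiguous
assert abs(itd_us(front) - itd_us(back)) < 1e-9

# After the yaw, source azimuths relative to the head shift by -yaw,
# and the ITD changes in OPPOSITE directions for the two candidates:
d_front = itd_us(front - yaw) - itd_us(front)   # negative: ITD shrinks
d_back = itd_us(back - yaw) - itd_us(back)      # positive: ITD grows
print(d_front, d_back)
```

The sign of the ITD change during the rotation is what distinguishes front from back, which is consistent with yaw (but not pitch) rotations reducing confusions.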
Affiliation(s)
- Glen McLachlan: Department of Engineering Management, University of Antwerp, Antwerp, Belgium
- Piotr Majdak: Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
- Jonas Reijniers: Department of Engineering Management, University of Antwerp, Antwerp, Belgium
- Michael Mihocic: Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
- Herbert Peremans: Department of Engineering Management, University of Antwerp, Antwerp, Belgium
4
Stevenson-Hoare JO, Freeman TCA, Culling JF. The pinna enhances angular discrimination in the frontal hemifield. J Acoust Soc Am 2022; 152:2140. PMID: 36319254. DOI: 10.1121/10.0014599.
Abstract
Human sound localization in the horizontal dimension is thought to be dominated by binaural cues, particularly interaural time delays, because monaural localization in this dimension is relatively poor. Remaining ambiguities of front versus back and up versus down are distinguished by high-frequency spectral cues generated by the pinna. The experiments in this study show that this account is incomplete. Using binaural listening throughout, the pinna substantially enhanced horizontal discrimination in the frontal hemifield, making discrimination in front better than discrimination at the rear, particularly for directions away from the median plane. Eliminating acoustic effects of the pinna by acoustically bypassing them or low-pass filtering abolished the advantage at the front without affecting the rear. Acoustic measurements revealed a pinna-induced spectral prominence that shifts smoothly in frequency as sounds move from 0° to 90° azimuth. The improved performance is discussed in terms of the monaural and binaural changes induced by the pinna.
Affiliation(s)
- Joshua O Stevenson-Hoare: School of Psychology, Cardiff University, Tower Building, Park Place, Cardiff CF10 3AT, United Kingdom
- Tom C A Freeman: School of Psychology, Cardiff University, Tower Building, Park Place, Cardiff CF10 3AT, United Kingdom
- John F Culling: School of Psychology, Cardiff University, Tower Building, Park Place, Cardiff CF10 3AT, United Kingdom
5
Riedel S, Zotter F. Surrounding line sources optimally reproduce diffuse envelopment at off-center listening positions. JASA Express Lett 2022; 2:094404. PMID: 36182342. DOI: 10.1121/10.0014168.
Abstract
Listener envelopment has previously been studied in the fields of room acoustics and multichannel sound reproduction. However, the potentially detrimental effect of a directional imbalance remains uninvestigated. This paper presents a listening experiment under anechoic conditions using a ring of 24 loudspeakers. Participants rated perceived envelopment for various loudspeaker subsets fed by incoherent noise signals. Off-center listening positions were simulated for different acoustic source models: -6 dB (point source), -3 dB (line source), or 0 dB attenuation for every doubling of distance. Only the line-source model preserved envelopment off-center, providing a low interaural level difference and a low interaural coherence as perceptual cues.
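The three source models quoted above correspond to spherical, cylindrical, and distance-invariant spreading. The per-doubling figures can be sketched as follows (a hedged illustration; the function name is hypothetical, not from the paper):

```python
import math

def attenuation_db(distance, ref_distance=1.0, model="point"):
    """Level drop relative to ref_distance for the three source models:
    point source (-6 dB per doubling of distance, spherical spreading),
    line source (-3 dB, cylindrical spreading), and a distance-invariant
    source (0 dB)."""
    ratio = distance / ref_distance
    if model == "point":      # 20*log10 of the distance ratio
        return -20.0 * math.log10(ratio)
    if model == "line":       # 10*log10 of the distance ratio
        return -10.0 * math.log10(ratio)
    if model == "constant":
        return 0.0
    raise ValueError(model)

for model in ("point", "line", "constant"):
    print(model, round(attenuation_db(2.0, model=model), 2))
# point -6.02, line -3.01, constant 0.0 (dB per doubling of distance)
```

The gentler roll-off of the line-source model is what keeps the ensemble of loudspeakers balanced at off-center positions, consistent with the finding that only this model preserved envelopment.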
Affiliation(s)
- Stefan Riedel: Institute of Electronic Music and Acoustics, University of Music and Performing Arts, Graz, 8010, Austria
- Franz Zotter: Institute of Electronic Music and Acoustics, University of Music and Performing Arts, Graz, 8010, Austria
6
Osses Vecchi A, Varnet L, Carney LH, Dau T, Bruce IC, Verhulst S, Majdak P. A comparative study of eight human auditory models of monaural processing. Acta Acust 2022; 6:17. PMID: 36325461; PMCID: PMC9625898. DOI: 10.1051/aacus/2022008.
Abstract
A number of auditory models have been developed using diverging approaches, either physiological or perceptual, but they share comparable stages of signal processing, as they are inspired by the same constitutive parts of the auditory system. We compare eight monaural models that are openly accessible in the Auditory Modelling Toolbox. We discuss the considerations required to make the model outputs comparable to each other, as well as the results for the following model processing stages or their equivalents: Outer and middle ear, cochlear filter bank, inner hair cell, auditory nerve synapse, cochlear nucleus, and inferior colliculus. The discussion includes a list of recommendations for future applications of auditory models.
Affiliation(s)
- Alejandro Osses Vecchi: Laboratoire des systèmes perceptifs, Département d'études cognitives, École Normale Supérieure, PSL University, CNRS, 75005 Paris, France
- Léo Varnet: Laboratoire des systèmes perceptifs, Département d'études cognitives, École Normale Supérieure, PSL University, CNRS, 75005 Paris, France
- Laurel H. Carney: Departments of Biomedical Engineering and Neuroscience, University of Rochester, Rochester, NY 14642, USA
- Torsten Dau: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
- Ian C. Bruce: Department of Electrical and Computer Engineering, McMaster University, Hamilton, ON L8S 4K1, Canada
- Sarah Verhulst: Hearing Technology group, WAVES, Department of Information Technology, Ghent University, 9000 Ghent, Belgium
- Piotr Majdak: Acoustics Research Institute, Austrian Academy of Sciences, 1040 Vienna, Austria
7
Braren HS, Fels J. Towards Child-Appropriate Virtual Acoustic Environments: A Database of High-Resolution HRTF Measurements and 3D-Scans of Children. Int J Environ Res Public Health 2021; 19:324. PMID: 35010583; PMCID: PMC8750994. DOI: 10.3390/ijerph19010324.
Abstract
Head-related transfer functions (HRTFs) play a significant role in modern acoustic experiment design, in the auralization of three-dimensional virtual acoustic environments. This technique makes it possible to create close-to-real-life situations, including room-acoustic effects, background noise, and multiple sources, in a controlled laboratory environment. While adult HRTF databases are widely available to the research community, datasets of children are not. To fill this gap, children aged 5-10 years were recruited among 1st- and 2nd-year primary school children in Aachen, Germany. Their HRTFs were measured in a hemi-anechoic chamber with a 5° × 5° resolution. Special care was taken to reduce motion artifacts during the measurements by means of fast measurement routines. To complement the HRTF measurements with the anthropometric data needed for individualization methods, a high-resolution 3D scan of each participant's head and upper torso was recorded. An HRTF measurement took around 3 min. The children's head movement during that time was larger than that of adult participants in comparable experiments but was generally kept within 5° of rotary and 1 cm of translatory motion; adult participants only exhibit this range of motion in longer measurements. A comparison of the HRTF measurements to the KEMAR artificial head shows that it is not representative of an average child HRTF. Differences can be seen both in the spectrum and in the interaural time delay (ITD), with ITD differences of 70 μs on average and a maximum of 138 μs. For both spectrum and ITD, the KEMAR more closely resembles the 95th percentile of the children's data. This warrants a closer look at using child-specific HRTFs in the binaural presentation of virtual acoustic environments in the future.
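The reported average ITD difference of about 70 μs is of the order a simple spherical-head model predicts when the head radius shrinks by roughly a centimeter. A sketch with illustrative radii (the radii are assumptions for illustration, not measurements from the study):

```python
import math

def woodworth_itd_us(head_radius_m, azimuth_deg, c=343.0):
    """Woodworth ray-tracing ITD for a spherical head, in microseconds:
    ITD = (a/c) * (theta + sin(theta)), for azimuths up to 90 degrees
    from the median plane."""
    theta = math.radians(azimuth_deg)
    return head_radius_m / c * (theta + math.sin(theta)) * 1e6

# Illustrative radii: a smaller child-sized head vs a KEMAR-sized head
child_itd = woodworth_itd_us(0.0775, 90.0)
adult_itd = woodworth_itd_us(0.0875, 90.0)
print(adult_itd - child_itd)   # ~75 us from a 1 cm radius difference
```

A centimeter-scale difference in effective head radius is thus enough to account for ITD offsets of the magnitude measured between children and the adult-sized manikin.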
Affiliation(s)
- Hark Simon Braren: Institute for Hearing Technology and Acoustics, RWTH Aachen University, Kopernikusstraße 5, 52074 Aachen, Germany
8
Head-Related Transfer Functions for Dynamic Listeners in Virtual Reality. Appl Sci (Basel) 2021. DOI: 10.3390/app11146646.
Abstract
In dynamic virtual reality, visual cues and motor actions aid auditory perception. Given multimodal integration and auditory adaptation effects, generic head-related transfer functions (HRTFs) may yield no significant disadvantage relative to individual HRTFs for accurate auditory perception. This study compares two individual HRTF sets against a generic HRTF set by way of objective analysis and two subjective experiments. First, auditory-model-based predictions examine the objective deviations in localization cues between the sets. Next, the HRTFs are compared in a static subjective localization experiment (N=8). Finally, the localization accuracy, timbre, and overall quality of the HRTF sets are evaluated subjectively (N=12) in a six-degrees-of-freedom audio-visual virtual environment. The results show statistically significant objective deviations between the sets, but no perceived differences in localization or overall quality in the dynamic virtual reality.
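Objective deviations in localization cues between HRTF sets can be quantified per direction from the interaural time and level differences of each head-related impulse response (HRIR) pair. The following is a simplified stand-in for such an analysis (the study itself uses auditory-model-based predictions; the function and the synthetic HRIR pair are illustrative assumptions):

```python
import numpy as np

def itd_ild_from_hrir(h_left, h_right, fs):
    """Broadband ITD via the cross-correlation peak lag, ILD via the
    energy ratio, from one HRIR pair. Positive ITD means the left ear
    lags, i.e. the source is toward the right."""
    xcorr = np.correlate(h_left, h_right, mode="full")
    lag = np.argmax(np.abs(xcorr)) - (len(h_right) - 1)
    itd_s = lag / fs
    ild_db = 10 * np.log10(np.sum(h_left**2) / np.sum(h_right**2))
    return itd_s, ild_db

# Synthetic check: a right-leading source, left ear delayed and attenuated
fs = 48000
h_r = np.zeros(256); h_r[10] = 1.0
h_l = np.zeros(256); h_l[34] = 0.5    # 24 samples later, half amplitude
itd, ild = itd_ild_from_hrir(h_l, h_r, fs)
print(itd * 1e6, ild)                 # ~500 us, ~-6 dB
```

Comparing such ITD/ILD values between a generic and an individual HRTF set, direction by direction, gives the kind of objective cue deviation the abstract contrasts with the (absent) perceptual differences.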
9
Stitt P, Katz BFG. Sensitivity analysis of pinna morphology on head-related transfer functions simulated via a parametric pinna model. J Acoust Soc Am 2021; 149:2559. PMID: 33940891. DOI: 10.1121/10.0004128.
Abstract
The head-related transfer function (HRTF) defines the acoustic path from a source to the two ears of a listener in a manner that is highly dependent on direction. This directional dependence arises from the highly individual morphology of the pinna, which results in complex reflections and resonances. While this notion is generally accepted, there has been little research on the importance of different structural elements of the pinna on the HRTF. A parametric three-dimensional ear model was used to investigate the changes in shape of the pinna in a systematic manner with a view to determining important contributing morphological parameters that can be used for HRTF individualization. HRTFs were simulated using the boundary element method. The analysis comprised objective comparisons between the directional transfer function and diffuse field component. The mean spectral distortion was used for global evaluation of HRTF similarity across all simulated positions. A perceptual localization model was used to determine correspondences between perceptual cues and objective parameters. A reasonable match was found between the modelled perceptual results and the mean spectral distortion. Modifications to the shape of the concha were found to have an important impact on the HRTF, as did those in proximity to the triangular fossa. Furthermore, parameters that control the relief of the pinna were found to be at least as important as more frequently cited side-facing parameters, highlighting limitations in previous morphological/HRTF studies.
Affiliation(s)
- Peter Stitt: Sorbonne Université, CNRS, UMR 7190, Institut Jean Le Rond d'Alembert, Lutheries-Acoustique-Musique, Paris, France
- Brian F G Katz: Sorbonne Université, CNRS, UMR 7190, Institut Jean Le Rond d'Alembert, Lutheries-Acoustique-Musique, Paris, France
10
Pausch F, Fels J. Localization Performance in a Binaural Real-Time Auralization System Extended to Research Hearing Aids. Trends Hear 2020; 24:2331216520908704. PMID: 32324491; PMCID: PMC7198834. DOI: 10.1177/2331216520908704.
Abstract
Auralization systems for auditory research should ideally be validated by perceptual experiments, as well as objective measures. This study employed perceptual tests to evaluate a recently proposed binaural real-time auralization system for hearing aid (HA) users. The dynamic localization of real sound sources was compared with that of virtualized ones, reproduced binaurally over headphones, loudspeakers with crosstalk cancellation (CTC) filters, research HAs, or combined via loudspeakers with CTC filters and research HAs under free-field conditions. System-inherent properties affecting localization cues were identified and their effects on overall horizontal localization, reversal rates, and angular error metrics were assessed. The general localization performance in combined reproduction was found to fall between what was measured for loudspeakers with CTC filters and research HAs alone. Reproduction via research HAs alone resulted in the highest reversal rates and angular errors. While combined reproduction helped decrease the reversal rates, no significant effect was observed on the angular error metrics. However, combined reproduction resulted in the same overall horizontal source localization performance as measured for real sound sources, while improving localization compared with reproduction over research HAs alone. Collectively, the results with respect to combined reproduction can be considered a performance indicator for future experiments involving HA users.
Affiliation(s)
- Florian Pausch: Teaching and Research Area of Medical Acoustics, Institute of Technical Acoustics, RWTH Aachen University
- Janina Fels: Teaching and Research Area of Medical Acoustics, Institute of Technical Acoustics, RWTH Aachen University
11
Jenny C, Reuter C. Usability of Individualized Head-Related Transfer Functions in Virtual Reality: Empirical Study With Perceptual Attributes in Sagittal Plane Sound Localization. JMIR Serious Games 2020; 8:e17576. PMID: 32897232; PMCID: PMC7509635. DOI: 10.2196/17576.
Abstract
BACKGROUND: In order to present virtual sound sources spatially via headphones, head-related transfer functions (HRTFs) can be applied to audio signals. In this so-called binaural virtual acoustics, spatial perception may be degraded if the HRTFs deviate from the true HRTFs of the listener. OBJECTIVE: In this study, participants wearing virtual reality (VR) headsets performed a listening test on the 3D audio perception of virtual audiovisual scenes, enabling us to investigate the necessity and influence of HRTF individualization. Two hypotheses were investigated: first, that general HRTFs limit 3D audio perception in VR and, second, that the localization model for stationary localization errors is transferable to nonindividualized HRTFs in more complex environments such as VR. METHODS: For the evaluation, 39 subjects rated individualized and nonindividualized HRTFs in an audiovisual virtual scene on the basis of 5 perceptual qualities: localizability, front-back position, externalization, tone color, and realism. The VR listening experiment consisted of 2 tests: in the first, subjects evaluated their own HRTF and the general HRTF from the Massachusetts Institute of Technology Knowles Electronics Manikin for Acoustic Research database; in the second, their own HRTF and 2 other nonindividualized HRTFs from the Acoustics Research Institute HRTF database. For the experiment, 2 subject-specific nonindividualized HRTFs with minimal and maximal localization error deviation were selected according to the localization model in sagittal planes. RESULTS: With the Wilcoxon signed-rank test for the first test, analysis of variance for the second test, and a sample size of 78, the results were significant for all perceptual qualities except the front-back position between the own and the minimally deviant nonindividualized HRTF (P=.06). CONCLUSIONS: Both hypotheses were accepted. Sounds filtered by individualized HRTFs are considered easier to localize, easier to externalize, more natural in timbre, and thus more realistic than sounds filtered by nonindividualized HRTFs.
Affiliation(s)
- Claudia Jenny: Musicological Department, University of Vienna, Vienna, Austria
12
Deng Y, Choi I, Shinn-Cunningham B, Baumgartner R. Impoverished auditory cues limit engagement of brain networks controlling spatial selective attention. Neuroimage 2019; 202:116151. PMID: 31493531. DOI: 10.1016/j.neuroimage.2019.116151.
Abstract
Spatial selective attention enables listeners to process a signal of interest in natural settings. However, most past studies on auditory spatial attention used impoverished spatial cues: presenting competing sounds to different ears, using only interaural differences in time (ITDs) and/or intensity (IIDs), or using non-individualized head-related transfer functions (HRTFs). Here we tested the hypothesis that impoverished spatial cues impair spatial auditory attention by only weakly engaging relevant cortical networks. Eighteen normal-hearing listeners reported the content of one of two competing syllable streams simulated at roughly +30° and -30° azimuth. The competing streams consisted of syllables from two different-sex talkers. Spatialization was based on natural spatial cues (individualized HRTFs), individualized IIDs, or generic ITDs. We measured behavioral performance as well as electroencephalographic markers of selective attention. Behaviorally, subjects recalled target streams most accurately with natural cues. Neurally, spatial attention significantly modulated early evoked sensory response magnitudes only for natural cues, not in conditions using only ITDs or IIDs. Consistent with this, parietal oscillatory power in the alpha band (8-14 Hz; associated with filtering out distracting events from unattended directions) showed significantly less attentional modulation with isolated spatial cues than with natural cues. Our findings support the hypothesis that spatial selective attention networks are only partially engaged by impoverished spatial auditory cues. These results not only suggest that studies using unnatural spatial cues underestimate the neural effects of spatial auditory attention, they also illustrate the importance of preserving natural spatial cues in assistive listening devices to support robust attentional control.
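An ITD-only stimulus of the kind used as an impoverished cue condition can be built by copying a mono signal to both channels and delaying the far ear, leaving level and spectrum identical across ears. A minimal sketch (the ITD value, sample rate, and function name are illustrative assumptions, not parameters from the study):

```python
import numpy as np

def spatialize_itd_only(mono, itd_s, fs):
    """ITD-only spatialization: duplicate the mono signal and delay the
    far ear by the ITD. No ILD and no spectral HRTF cues are imposed."""
    delay = int(round(abs(itd_s) * fs))
    near = np.concatenate([mono, np.zeros(delay)])
    far = np.concatenate([np.zeros(delay), mono])
    # positive itd_s: source toward the right -> left ear is the far ear
    return (far, near) if itd_s > 0 else (near, far)

fs = 44100
mono = np.random.randn(fs)   # 1 s noise token
# ~260 us ITD, roughly a +30 degree source under a spherical-head model
left, right = spatialize_itd_only(mono, 260e-6, fs)
```

Because the two channels differ only by a pure delay, such a stimulus carries none of the spectral detail of natural (individualized-HRTF) spatialization, which is the contrast the study exploits.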
Affiliation(s)
- Yuqi Deng: Biomedical Engineering, Boston University, Boston, MA, 02215, USA
- Inyong Choi: Communication Sciences & Disorders, University of Iowa, Iowa City, IA, 52242, USA
- Barbara Shinn-Cunningham: Biomedical Engineering, Boston University, Boston, MA, 02215, USA; Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Robert Baumgartner: Biomedical Engineering, Boston University, Boston, MA, 02215, USA; Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
13
Balachandar K, Carlile S. The monaural spectral cues identified by a reverse correlation analysis of free-field auditory localization data. J Acoust Soc Am 2019; 146:29. PMID: 31370620. DOI: 10.1121/1.5113577.
Abstract
The outer ear's location-dependent pattern of spectral filtering generates cues used to determine a sound source's elevation and front-back location. The authors aim to identify these features using a reverse correlation analysis (RCA), combining free-field localization behaviour with the magnitude spectra of the associated head-related transfer functions (HRTFs) from a sample of 73 participants. Localization responses were collected before and immediately after introducing a pair of outer-ear inserts that modified the listener's HRTFs to varying extents. The RCA identified several features responsible for eliciting localization responses, and their efficacy was examined using two models of monaural localization. In general, the predicted performance closely matched the free-field localization error for the bare-ear condition; however, both models tended to grossly overestimate the localization error based on the HRTFs modified by the outer-ear inserts. The RCA's feature selection notably had the effect of better aligning the predicted performance of both models with the actual localization performance. This suggests that the RCA revealed sufficient detail for both models to correctly predict localization performance and also limited the influence of filtered-out elements in the distorted HRTFs that had degraded the accuracy of both models.
Affiliation(s)
- Kapilesh Balachandar: Auditory Neuroscience Laboratory, University of Sydney, New South Wales 2006, Australia
- Simon Carlile: Auditory Neuroscience Laboratory, University of Sydney, New South Wales 2006, Australia
14
Brinkmann F, Aspöck L, Ackermann D, Lepa S, Vorländer M, Weinzierl S. A round robin on room acoustical simulation and auralization. J Acoust Soc Am 2019; 145:2746. PMID: 31046379. DOI: 10.1121/1.5096178.
Abstract
A round robin was conducted to evaluate the state of the art of room acoustic modeling software in both the physical and perceptual realms. The test was based on six acoustic scenes highlighting specific acoustic phenomena and on three complex, "real-world" spatial environments. The results demonstrate that most present simulation algorithms generate obvious model errors once the assumptions of geometrical acoustics are no longer met. As a consequence, they are neither able to provide a reliable pattern of early reflections nor a reliable prediction of room acoustic parameters outside a medium frequency range. In the perceptual domain, the algorithms under test could generate mostly plausible but not authentic auralizations, i.e., the difference between simulated and measured impulse responses of the same scene was always clearly audible. Most relevant for this perceptual difference are deviations in tone color and source position between measurement and simulation, which can largely be traced back to the simplified use of random-incidence absorption and scattering coefficients and to shortcomings in the simulation of early reflections due to missing or insufficient modeling of diffraction.
Affiliation(s)
- Fabian Brinkmann: Audio Communication Group, Technical University of Berlin, Einsteinufer 17c, Berlin, D-10587, Germany
- Lukas Aspöck: Institute of Technical Acoustics, Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen University, Kopernikusstraße 5, Aachen, D-52074, Germany
- David Ackermann: Audio Communication Group, Technical University of Berlin, Einsteinufer 17c, Berlin, D-10587, Germany
- Steffen Lepa: Audio Communication Group, Technical University of Berlin, Einsteinufer 17c, Berlin, D-10587, Germany
- Michael Vorländer: Institute of Technical Acoustics, Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen University, Kopernikusstraße 5, Aachen, D-52074, Germany
- Stefan Weinzierl: Audio Communication Group, Technical University of Berlin, Einsteinufer 17c, Berlin, D-10587, Germany
15
Ahrens A, Lund KD, Marschall M, Dau T. Sound source localization with varying amount of visual information in virtual reality. PLoS One 2019; 14:e0214603. PMID: 30925174; PMCID: PMC6440636. DOI: 10.1371/journal.pone.0214603.
Abstract
To achieve accurate spatial auditory perception, subjects typically require personal head-related transfer functions (HRTFs) and the freedom for head movements. Loudspeaker-based virtual sound environments allow for realism without individualized measurements. To study audio-visual perception in realistic environments, the combination of spatially tracked head-mounted displays (HMDs), also known as virtual reality glasses, and virtual sound environments may be valuable. However, HMDs were recently shown to affect the subjects' HRTFs and thus might influence sound localization performance. Furthermore, audio-visual perception might be influenced by the limited reproduction of visual information on the HMD. Here, a sound localization experiment was conducted both with and without an HMD and with a varying amount of visual information provided to the subjects. Furthermore, errors in interaural time and level differences (ITDs and ILDs) as well as spectral perturbations induced by the HMD were analyzed and compared to the perceptual localization data. The results showed a reduction in localization accuracy when the subjects were wearing an HMD and when they were blindfolded. The HMD-induced error in azimuth localization was larger in the left hemisphere than in the right. When visual information on the limited set of source locations was provided, the localization error induced by the HMD was negligible. Presenting visual information on hand location and room dimensions yielded better sound localization performance than the condition with no visual information, and adding the possible source locations further improved localization accuracy. Adding pointing feedback in the form of a virtual laser pointer improved the accuracy of elevation perception but not of azimuth perception.
Collapse
Affiliation(s)
- Axel Ahrens
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark

| | - Kasper Duemose Lund
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark
| | - Marton Marschall
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark
| | - Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark
| |
Collapse
|
16
|
Zonooz B, Arani E, Körding KP, Aalbers PATR, Celikel T, Van Opstal AJ. Spectral Weighting Underlies Perceived Sound Elevation. Sci Rep 2019; 9:1642. [PMID: 30733476 PMCID: PMC6367479 DOI: 10.1038/s41598-018-37537-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2018] [Accepted: 12/07/2018] [Indexed: 11/09/2022] Open
Abstract
The brain estimates the two-dimensional direction of sounds from the pressure-induced displacements of the eardrums. Accurate localization along the horizontal plane (azimuth angle) is enabled by binaural difference cues in timing and intensity. Localization along the vertical plane (elevation angle), including frontal and rear directions, relies on spectral cues made possible by the elevation dependent filtering in the idiosyncratic pinna cavities. However, the problem of extracting elevation from the sensory input is ill-posed, since the spectrum results from a convolution between source spectrum and the particular head-related transfer function (HRTF) associated with the source elevation, which are both unknown to the system. It is not clear how the auditory system deals with this problem, or which implicit assumptions it makes about source spectra. By varying the spectral contrast of broadband sounds around the 6–9 kHz band, which falls within the human pinna’s most prominent elevation-related spectral notch, we here suggest that the auditory system performs a weighted spectral analysis across different frequency bands to estimate source elevation. We explain our results by a model, in which the auditory system weighs the different spectral bands, and compares the convolved weighted sensory spectrum with stored information about its own HRTFs, and spatial prior assumptions.
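The weighted spectral-template comparison described in the abstract can be sketched as a toy computation. The function name `estimate_elevation`, the uniform weights, and the squared-error distance are illustrative assumptions; the authors' model additionally incorporates spatial priors, omitted here.

```python
import numpy as np

def estimate_elevation(sensory_spectrum, hrtf_templates, weights):
    """Return the elevation whose stored HRTF template best matches
    the sensory spectrum under a per-band weighting (spectra in dB)."""
    best_elev, best_err = None, np.inf
    for elev, template in hrtf_templates.items():
        # Weighted squared spectral distance across frequency bands.
        err = np.sum(weights * (sensory_spectrum - template) ** 2)
        if err < best_err:
            best_elev, best_err = elev, err
    return best_elev

# Toy example: three stored templates; the input is a slightly
# perturbed copy of the 30-degree template.
rng = np.random.default_rng(1)
templates = {elev: rng.standard_normal(32) for elev in (-30, 0, 30)}
weights = np.ones(32)  # a real model would emphasize e.g. the 6-9 kHz notch band
elev_hat = estimate_elevation(templates[30] + 0.01, templates, weights)
```

In the toy example, the perturbed input is still closest to its source template, so the 30-degree elevation is selected.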
Collapse
Affiliation(s)
- Bahram Zonooz
- Biophysics Department, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, 6525 AJ, Nijmegen, The Netherlands
| | - Elahe Arani
- Biophysics Department, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, 6525 AJ, Nijmegen, The Netherlands
| | - Konrad P Körding
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA.,Department of Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
| | - P A T Remco Aalbers
- Biophysics Department, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, 6525 AJ, Nijmegen, The Netherlands
| | - Tansu Celikel
- Neurophysiology Department, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, 6525 AJ, Nijmegen, The Netherlands
| | - A John Van Opstal
- Biophysics Department, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, 6525 AJ, Nijmegen, The Netherlands.
| |
Collapse
|
17
|
Zonooz B, Arani E, Opstal AJV. Learning to localise weakly-informative sound spectra with and without feedback. Sci Rep 2018; 8:17933. [PMID: 30560940 PMCID: PMC6298951 DOI: 10.1038/s41598-018-36422-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2018] [Accepted: 11/20/2018] [Indexed: 11/12/2022] Open
Abstract
How the human auditory system learns to map complex pinna-induced spectral-shape cues onto veridical estimates of sound-source elevation in the median plane is still unclear. Earlier studies demonstrated considerable sound-localisation plasticity in response to pinna moulds and to altered vision. Several factors may contribute to auditory spatial learning, like visual or motor feedback, or updated priors. We here induced perceptual learning for sounds with degraded spectral content, having weak, but consistent, elevation-dependent cues, as demonstrated by low-gain stimulus-response relations. During training, we provided visual feedback for only six targets in the midsagittal plane, to which listeners gradually improved their response accuracy. Interestingly, listeners' performance also improved without visual feedback, albeit less strongly. Post-training results showed generalised improved response behaviour, also to non-trained locations and acoustic spectra, presented throughout the two-dimensional frontal hemifield. We argue that the auditory system learns to reweigh contributions from low-informative spectral bands to update its prior elevation estimates, and explain our results with a neuro-computational model.
Collapse
Affiliation(s)
- Bahram Zonooz
- Biophysics Department, Donders Center for Neuroscience, Radboud University, Heyendaalseweg 135, 6525, AJ, Nijmegen, The Netherlands
| | - Elahe Arani
- Biophysics Department, Donders Center for Neuroscience, Radboud University, Heyendaalseweg 135, 6525, AJ, Nijmegen, The Netherlands
| | - A John Van Opstal
- Biophysics Department, Donders Center for Neuroscience, Radboud University, Heyendaalseweg 135, 6525, AJ, Nijmegen, The Netherlands.
| |
Collapse
|
18
|
Abstract
Ambisonics has enjoyed a recent resurgence in popularity due to virtual reality applications. Low order Ambisonic reproduction is inherently inaccurate at high frequencies, which causes poor timbre and height localisation. Diffuse-Field Equalisation (DFE), the theory of removing direction-independent frequency response, is applied to binaural (over headphones) Ambisonic rendering to address high-frequency reproduction. DFE of Ambisonics is evaluated by comparing binaural Ambisonic rendering to direct convolution via head-related impulse responses (HRIRs) in three ways: spectral difference, predicted sagittal plane localisation and perceptual listening tests on timbre. Results show DFE successfully improves frequency reproduction of binaural Ambisonic rendering for the majority of sound source locations, as well as the limitations of the technique, and set the basis for further research in the field.
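The diffuse-field equalisation idea can be sketched under the common assumption that the diffuse-field response is the RMS average of HRTF magnitudes across directions, with its regularised inverse used as the equaliser. The function name and the regularisation constant are illustrative, not taken from the paper.

```python
import numpy as np

def diffuse_field_filter(hrtf_mags, eps=1e-6):
    """Magnitude response of a diffuse-field equalisation filter.
    hrtf_mags: array (n_directions, n_freqs) of linear HRTF magnitudes.
    The diffuse-field response is the RMS average over directions;
    the equaliser is its regularised inverse."""
    diffuse = np.sqrt(np.mean(hrtf_mags ** 2, axis=0))
    return 1.0 / np.maximum(diffuse, eps)

# After equalisation, the direction-averaged (RMS) magnitude is flat.
rng = np.random.default_rng(0)
mags = rng.uniform(0.5, 2.0, size=(8, 16))
equalised = mags * diffuse_field_filter(mags)
```

By construction, multiplying each direction's magnitude response by this filter flattens the RMS average across directions while leaving direction-dependent differences intact.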
Collapse
|
19
|
Denk F, Ewert SD, Kollmeier B. Spectral directional cues captured by hearing device microphones in individual human ears. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 144:2072. [PMID: 30404454 DOI: 10.1121/1.5056173] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/12/2018] [Accepted: 09/11/2018] [Indexed: 06/08/2023]
Abstract
Spatial hearing abilities with hearing devices ultimately depend on how well acoustic directional cues are captured by the microphone(s) of the device. A comprehensive objective evaluation of monaural spectral directional cues captured at 9 microphone locations integrated in 5 hearing device styles is presented, utilizing a recent database of head-related transfer functions (HRTFs) that includes data from 16 human and 3 artificial ear pairs. Differences between HRTFs to the eardrum and hearing device microphones were assessed by descriptive analyses and quantitative metrics, and compared to differences between individual ears. Directional information exploited for vertical sound localization was evaluated by means of computational models. Directional information at microphone locations inside the pinna is significantly biased and qualitatively poorer compared to locations in the ear canal; behind-the-ear microphones capture almost no directional cues. These errors are expected to impair vertical sound localization, even if the new cues were optimally mapped to locations. Differences between HRTFs to the eardrum and hearing device microphones are qualitatively different from between-subject differences and can be described as a partial destruction rather than an alteration of relevant cues, although spectral difference metrics produce similar results. Dummy heads do not fully reflect the results with individual subjects.
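A simple spectral difference metric of the kind mentioned in the abstract can be sketched as an RMS log-magnitude difference between two transfer functions. This is one common formulation, not necessarily the specific metric used by the authors; the function name is illustrative.

```python
import numpy as np

def log_spectral_distortion(h_ref, h_dev):
    """RMS log-spectral difference in dB between two magnitude responses,
    e.g. an eardrum HRTF vs. a hearing-device-microphone HRTF."""
    diff_db = 20 * np.log10(np.abs(h_ref)) - 20 * np.log10(np.abs(h_dev))
    return np.sqrt(np.mean(diff_db ** 2))

# A uniform 2x gain difference yields a constant 20*log10(2) ~ 6.02 dB.
rng = np.random.default_rng(0)
h = rng.uniform(0.5, 2.0, size=64)
lsd = log_spectral_distortion(h, 2 * h)
```

Note that a flat gain offset produces a nonzero value here; in practice such metrics are often computed after broadband level alignment so that only spectral-shape differences remain.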
Collapse
Affiliation(s)
- Florian Denk
- Medizinische Physik and Cluster of Excellence "Hearing4all," University of Oldenburg, Küpkersweg 74, 26129 Oldenburg, Germany
| | - Stephan D Ewert
- Medizinische Physik and Cluster of Excellence "Hearing4all," University of Oldenburg, Küpkersweg 74, 26129 Oldenburg, Germany
| | - Birger Kollmeier
- Medizinische Physik and Cluster of Excellence "Hearing4all," University of Oldenburg, Küpkersweg 74, 26129 Oldenburg, Germany
| |
Collapse
|
20
|
Watson CJG, Carlile S, Kelly H, Balachandar K. The Generalization of Auditory Accommodation to Altered Spectral Cues. Sci Rep 2017; 7:11588. [PMID: 28912440 PMCID: PMC5599623 DOI: 10.1038/s41598-017-11981-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2017] [Accepted: 08/30/2017] [Indexed: 11/23/2022] Open
Abstract
The capacity of healthy adult listeners to accommodate to altered spectral cues to the source locations of broadband sounds has now been well documented. In recent years we have demonstrated that the degree and speed of accommodation are improved by using an integrated sensory-motor training protocol under anechoic conditions. Here we demonstrate that the learning which underpins the localization performance gains during the accommodation process using anechoic broadband training stimuli generalize to environmentally relevant scenarios. As previously, alterations to monaural spectral cues were produced by fitting participants with custom-made outer ear molds, worn during waking hours. Following acute degradations in localization performance, participants then underwent daily sensory-motor training to improve localization accuracy using broadband noise stimuli over ten days. Participants not only demonstrated post-training improvements in localization accuracy for broadband noises presented in the same set of positions used during training, but also for stimuli presented in untrained locations, for monosyllabic speech sounds, and for stimuli presented in reverberant conditions. These findings shed further light on the neuroplastic capacity of healthy listeners, and represent the next step in the development of training programs for users of assistive listening devices which degrade localization acuity by distorting or bypassing monaural cues.
Collapse
Affiliation(s)
- Christopher J G Watson
- School of Medical Sciences, University of Sydney, Sydney, New South Wales, 2006, Australia.
| | - Simon Carlile
- School of Medical Sciences, University of Sydney, Sydney, New South Wales, 2006, Australia
| | - Heather Kelly
- School of Medical Sciences, University of Sydney, Sydney, New South Wales, 2006, Australia
| | - Kapilesh Balachandar
- School of Medical Sciences, University of Sydney, Sydney, New South Wales, 2006, Australia
| |
Collapse
|
21
|
Asymmetries in behavioral and neural responses to spectral cues demonstrate the generality of auditory looming bias. Proc Natl Acad Sci U S A 2017; 114:9743-9748. [PMID: 28827336 DOI: 10.1073/pnas.1703247114] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Studies of auditory looming bias have shown that sources increasing in intensity are more salient than sources decreasing in intensity. Researchers have argued that listeners are more sensitive to approaching sounds compared with receding sounds, reflecting an evolutionary pressure. However, these studies only manipulated overall sound intensity; therefore, it is unclear whether looming bias is truly a perceptual bias for changes in source distance, or only in sound intensity. Here we demonstrate both behavioral and neural correlates of looming bias without manipulating overall sound intensity. In natural environments, the pinnae induce spectral cues that give rise to a sense of externalization; when spectral cues are unnatural, sounds are perceived as closer to the listener. We manipulated the contrast of individually tailored spectral cues to create sounds of similar intensity but different naturalness. We confirmed that sounds were perceived as approaching when spectral contrast decreased, and perceived as receding when spectral contrast increased. We measured behavior and electroencephalography while listeners judged motion direction. Behavioral responses showed a looming bias in that responses were more consistent for sounds perceived as approaching than for sounds perceived as receding. In a control experiment, looming bias disappeared when spectral contrast changes were discontinuous, suggesting that perceived motion in distance and not distance itself was driving the bias. Neurally, looming bias was reflected in an asymmetry of late event-related potentials associated with motion evaluation. Hence, both our behavioral and neural findings support a generalization of the auditory looming bias, representing a perceptual preference for approaching auditory objects.
Collapse
|
22
|
Ben-Hur Z, Brinkmann F, Sheaffer J, Weinzierl S, Rafaely B. Spectral equalization in binaural signals represented by order-truncated spherical harmonics. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 141:4087. [PMID: 28618825 PMCID: PMC5457295 DOI: 10.1121/1.4983652] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/29/2016] [Revised: 04/14/2017] [Accepted: 05/02/2017] [Indexed: 06/07/2023]
Abstract
The synthesis of binaural signals from spherical microphone array recordings has been recently proposed. The limited spatial resolution of the reproduced signal due to order-limited reproduction has been previously investigated perceptually, showing spatial perception ramifications, such as poor source localization and limited externalization. Furthermore, this spatial order limitation also has a detrimental effect on the frequency content of the signal and its perceived timbre, due to the rapid roll-off at high frequencies. In this paper, the underlying causes of this spectral roll-off are described mathematically and investigated numerically. A digital filter that equalizes the frequency spectrum of a low spatial order signal is introduced and evaluated. A comprehensive listening test was conducted to study the influence of the filter on the perception of the reproduced sound. Results indicate that the suggested filter is beneficial for restoring the timbral composition of order-truncated binaural signals, while conserving, and even improving, some spatial properties of the signal.
Collapse
Affiliation(s)
- Zamir Ben-Hur
- Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
| | - Fabian Brinkmann
- Audio Communication Group, Technical University of Berlin, Einsteinufer 17c, D-10587 Berlin, Germany
| | - Jonathan Sheaffer
- Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
| | - Stefan Weinzierl
- Audio Communication Group, Technical University of Berlin, Einsteinufer 17c, D-10587 Berlin, Germany
| | - Boaz Rafaely
- Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
| |
Collapse
|
23
|
Joubaud T, Zimpfer V, Garcia A, Langrenne C. Sound localization models as evaluation tools for tactical communication and protective systems. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 141:2637. [PMID: 28464634 DOI: 10.1121/1.4979693] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/08/2023]
Abstract
Tactical Communication and Protective Systems (TCAPS) are hearing protection devices that sufficiently protect the listener's ears from hazardous sounds and preserve speech intelligibility. However, previous studies demonstrated that TCAPS still deteriorate the listener's situational awareness, in particular, the ability to locate sound sources. On the horizontal plane, this is mainly explained by the degradation of the acoustical cues normally preventing the listener from making front-back confusions. As part of TCAPS development and assessment, a method predicting the TCAPS-induced degradation of the sound localization capability based on electroacoustic measurements would be more suitable than time-consuming behavioral experiments. In this context, the present paper investigates two methods based on Head-Related Transfer Functions (HRTFs): a template-matching model and a three-layer neural network. They are optimized to fit human sound source identification performance in open ear condition. The methods are applied to HRTFs measured with six TCAPS, providing identification probabilities. They are compared with the results of a behavioral experiment, conducted with the same protectors, and which ranks the TCAPS by type. The neural network predicts realistic performances with earplugs, but overestimates errors with earmuffs. The template-matching model predicts human performance well, except for two particular TCAPS.
Collapse
Affiliation(s)
- Thomas Joubaud
- Acoustics and Protection of the Soldier, French-German Research Institute of Saint-Louis, 5 rue du Général Cassagnou, BP 70034, 68301 Saint-Louis, France
| | - Véronique Zimpfer
- Acoustics and Protection of the Soldier, French-German Research Institute of Saint-Louis, 5 rue du Général Cassagnou, BP 70034, 68301 Saint-Louis, France
| | - Alexandre Garcia
- Laboratoire de Mécanique des Structures et des Systèmes Couplés, Conservatoire National des Arts et Métiers, 292 rue Saint-Martin, 75141 Paris Cedex 03, France
| | - Christophe Langrenne
- Laboratoire de Mécanique des Structures et des Systèmes Couplés, Conservatoire National des Arts et Métiers, 292 rue Saint-Martin, 75141 Paris Cedex 03, France
| |
Collapse
|
24
|
Simon LSR, Zacharov N, Katz BFG. Perceptual attributes for the comparison of head-related transfer functions. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2016; 140:3623. [PMID: 27908072 DOI: 10.1121/1.4966115] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
The benefit of using individual head-related transfer functions (HRTFs) in binaural audio is well documented with regards to improving localization precision. However, with the increased use of binaural audio in more complex scene renderings, cognitive studies, and virtual and augmented reality simulations, the perceptual impact of HRTF selection may go beyond simple localization. In this study, the authors develop a list of attributes which qualify the perceived differences between HRTFs, providing a qualitative understanding of the perceptual variance of non-individual binaural renderings. The list of attributes was designed using a Consensus Vocabulary Protocol elicitation method. Participants followed an Individual Vocabulary Protocol elicitation procedure, describing the perceived differences between binaural stimuli based on binauralized extracts of multichannel productions. This was followed by an automated lexical reduction and a series of consensus group meetings during which participants agreed on a list of relevant attributes. Finally, the proposed list of attributes was then evaluated through a listening test, leading to eight valid perceptual attributes for describing the perceptual dimensions affected by HRTF set variations.
Collapse
Affiliation(s)
- Laurent S R Simon
- Audio Acoustics Group, LIMSI, CNRS, Université Paris-Saclay, 91405 Orsay, France
| | | | - Brian F G Katz
- Audio Acoustics Group, LIMSI, CNRS, Université Paris-Saclay, 91405 Orsay, France
| |
Collapse
|
25
|
Baumgartner R, Majdak P, Laback B. Modeling the Effects of Sensorineural Hearing Loss on Sound Localization in the Median Plane. Trends Hear 2016; 20:2331216516662003. [PMID: 27659486 PMCID: PMC5055367 DOI: 10.1177/2331216516662003] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
Listeners use monaural spectral cues to localize sound sources in sagittal planes (along the up-down and front-back directions). How sensorineural hearing loss affects the salience of monaural spectral cues is unclear. To simulate the effects of outer-hair-cell (OHC) dysfunction and the contribution of different auditory-nerve fiber types on localization performance, we incorporated a nonlinear model of the auditory periphery into a model of sagittal-plane sound localization for normal-hearing listeners. The localization model was first evaluated in its ability to predict the effects of spectral cue modifications for normal-hearing listeners. Then, we used it to simulate various degrees of OHC dysfunction applied to different types of auditory-nerve fibers. Predicted localization performance was hardly affected by mild OHC dysfunction but was strongly degraded in conditions involving severe and complete OHC dysfunction. These predictions resemble the usually observed degradation in localization performance induced by sensorineural hearing loss. Predicted localization performance was best when preserving fibers with medium spontaneous rates, which is particularly important in view of noise-induced hearing loss associated with degeneration of this fiber type. On average across listeners, predicted localization performance was strongly related to level discrimination sensitivity of auditory-nerve fibers, indicating an essential role of this coding property for localization accuracy in sagittal planes.
Collapse
Affiliation(s)
- Robert Baumgartner
- Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
| | - Piotr Majdak
- Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
| | - Bernhard Laback
- Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
| |
Collapse
|
26
|
Harder S, Paulsen RR, Larsen M, Laugesen S, Mihocic M, Majdak P. A framework for geometry acquisition, 3-D printing, simulation, and measurement of head-related transfer functions with a focus on hearing-assistive devices. COMPUTER AIDED DESIGN 2016; 75-76:39-46. [PMID: 28239188 PMCID: PMC5321480 DOI: 10.1016/j.cad.2016.02.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Individual head-related transfer functions (HRTFs) are essential in applications like fitting hearing-assistive devices (HADs) for providing accurate sound localization performance. Individual HRTFs are usually obtained through intricate acoustic measurements. This paper investigates the use of a three-dimensional (3D) head model for acquisition of individual HRTFs. Two aspects were investigated: whether a 3D-printed model can replace measurements on a human listener, and whether numerical simulations can replace acoustic measurements. For this purpose, HRTFs were acoustically measured for four human listeners and for a 3D printed head model of one of these listeners. Further, HRTFs were simulated by applying the finite element method to the 3D head model. The monaural spectral features and spectral distortions were very similar between re-measurements and between human and printed measurements; however, larger deviations were observed between measurement and simulation. The binaural cues were in agreement among all HRTFs of the same listener, indicating that the 3D model is able to provide localization cues potentially accessible to HAD users. Hence, the pipeline of geometry acquisition, printing, and acoustic measurements or simulations seems to be a promising step forward towards in-silico design of HADs.
Collapse
Affiliation(s)
- Stine Harder
- Technical University of Denmark, DTU Compute, DK-2800 Lyngby, Denmark
| | - Rasmus R. Paulsen
- Technical University of Denmark, DTU Compute, DK-2800 Lyngby, Denmark
| | | | | | - Michael Mihocic
- Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
| | - Piotr Majdak
- Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
| |
Collapse
|
27
|
Baumgartner R, Majdak P. Modeling Localization of Amplitude-Panned Virtual Sources in Sagittal Planes. JOURNAL OF THE AUDIO ENGINEERING SOCIETY. AUDIO ENGINEERING SOCIETY 2015; 63:562-569. [PMID: 26441471 PMCID: PMC4591473 DOI: 10.17743/jaes.2015.0063] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
Vector-base amplitude panning (VBAP) aims at creating virtual sound sources at arbitrary directions within multichannel sound reproduction systems. However, VBAP does not consistently produce listener-specific monaural spectral cues that are essential for localization of sound sources in sagittal planes, including the front-back and up-down dimensions. In order to better understand the limitations of VBAP, a functional model approximating human processing of spectro-spatial information was applied to assess accuracy in sagittal-plane localization of virtual sources created by means of VBAP. First, we evaluated VBAP applied on two loudspeakers in the median plane, and then we investigated the directional dependence of the localization accuracy in several three-dimensional loudspeaker arrangements designed in layers of constant elevation. The model predicted a strong dependence on listeners' individual head-related transfer functions, on virtual source directions, and on loudspeaker arrangements. In general, the simulations showed a systematic degradation with increasing polar-angle span between neighboring loudspeakers. For the design of VBAP systems, predictions suggest that spans up to 40° polar angle yield a good trade-off between system complexity and localization accuracy. Special attention should be paid to the frontal region where listeners are most sensitive to deviating spectral cues.
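The vector-base formulation underlying VBAP can be shown in a minimal two-dimensional sketch for a single loudspeaker pair; the paper itself evaluates 3-D sagittal-plane localization, and the function name `vbap_pair_gains` is illustrative.

```python
import numpy as np

def vbap_pair_gains(source_az_deg, spk_az_deg):
    """2-D VBAP gains for one loudspeaker pair (azimuths in degrees).
    Solves p = g1*l1 + g2*l2 for the gains and normalises them to
    unit energy, following the standard vector-base formulation."""
    def unit(az_deg):
        az = np.radians(az_deg)
        return np.array([np.cos(az), np.sin(az)])
    L = np.stack([unit(a) for a in spk_az_deg])  # rows: loudspeaker unit vectors
    p = unit(source_az_deg)                      # panning direction
    g = np.linalg.solve(L.T, p)                  # p = L.T @ g
    return g / np.linalg.norm(g)

# A source centred between loudspeakers at +/-30 degrees gets equal gains;
# a source at a loudspeaker position gets all the gain on that loudspeaker.
gains = vbap_pair_gains(0.0, (-30.0, 30.0))
```

The unit-energy normalisation keeps the perceived level roughly constant as the virtual source moves between the loudspeakers.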
Collapse
Affiliation(s)
- Robert Baumgartner
- Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
| | - Piotr Majdak
- Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
| |
Collapse
|
28
|
Marelli D, Baumgartner R, Majdak P. Efficient Approximation of Head-Related Transfer Functions in Subbands for Accurate Sound Localization. IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING 2015; 23:1130-1143. [PMID: 26681930 PMCID: PMC4678625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
Head-related transfer functions (HRTFs) describe the acoustic filtering of incoming sounds by the human morphology and are essential for listeners to localize sound sources in virtual auditory displays. Since rendering complex virtual scenes is computationally demanding, we propose four algorithms for efficiently representing HRTFs in subbands, i.e., as an analysis filterbank (FB) followed by a transfer matrix and a synthesis FB. All four algorithms use sparse approximation procedures to minimize the computational complexity while maintaining perceptually relevant HRTF properties. The first two algorithms separately optimize the complexity of the transfer matrix associated to each HRTF for fixed FBs. The other two algorithms jointly optimize the FBs and transfer matrices for complete HRTF sets by two variants. The first variant aims at minimizing the complexity of the transfer matrices, while the second one does it for the FBs. Numerical experiments investigate the latency-complexity trade-off and show that the proposed methods offer significant computational savings when compared with other available approaches. Psychoacoustic localization experiments were modeled and conducted to find a reasonable approximation tolerance so that no significant localization performance degradation was introduced by the subband representation.
Collapse
Affiliation(s)
- Damián Marelli
- School of Electrical Engineering and Computer Science, University of Newcastle, Callaghan, NSW 2308, Australia; Acoustics Research Institute, Austrian Academy of Sciences, Austria
| | - Robert Baumgartner
- Acoustics Research Institute, Austrian Academy of Sciences, 1040 Vienna, Austria
| | - Piotr Majdak
- Acoustics Research Institute, Austrian Academy of Sciences, 1040 Vienna, Austria
| |
Collapse
|
29
|
Majdak P, Baumgartner R, Laback B. Acoustic and non-acoustic factors in modeling listener-specific performance of sagittal-plane sound localization. Front Psychol 2014; 5:319. [PMID: 24795672 PMCID: PMC4006033 DOI: 10.3389/fpsyg.2014.00319] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2013] [Accepted: 03/27/2014] [Indexed: 11/13/2022] Open
Abstract
The ability to localize sound sources in sagittal planes (along the top-down and front-back dimension) varies considerably across listeners. The directional acoustic spectral features, described by head-related transfer functions (HRTFs), also vary considerably across listeners, a consequence of the listener-specific shape of the ears. It is not clear whether the differences in localization ability result from differences in the encoding of directional information provided by the HRTFs, i.e., an acoustic factor, or from differences in auditory processing of those cues (e.g., spectral-shape sensitivity), i.e., non-acoustic factors. We addressed this issue by analyzing the listener-specific localization ability in terms of localization performance. Directional responses to spatially distributed broadband stimuli from 18 listeners were used. A model of sagittal-plane localization was fit individually for each listener by considering the actual localization performance, the listener-specific HRTFs representing the acoustic factor, and an uncertainty parameter representing the non-acoustic factors. The model was configured to simulate the condition of complete calibration of the listener to the tested HRTFs. Listener-specifically calibrated model predictions yielded correlations of, on average, 0.93 with the actual localization performance. Then, the model parameters representing the acoustic and non-acoustic factors were systematically permuted across the listener group. While the permutation of HRTFs affected the localization performance, the permutation of listener-specific uncertainty had a substantially larger impact. Our findings suggest that across-listener variability in sagittal-plane localization ability is only marginally determined by the acoustic factor, i.e., the quality of directional cues found in typical human HRTFs. Rather, the non-acoustic factors, supposed to represent the listeners' efficiency in processing directional cues, appear to be important.
Collapse
Affiliation(s)
- Piotr Majdak
- Psychoacoustics and Experimental Audiology, Acoustics Research Institute, Austrian Academy of Sciences Wien, Austria
| | - Robert Baumgartner
- Psychoacoustics and Experimental Audiology, Acoustics Research Institute, Austrian Academy of Sciences Wien, Austria
| | - Bernhard Laback
- Psychoacoustics and Experimental Audiology, Acoustics Research Institute, Austrian Academy of Sciences Wien, Austria
| |
Collapse
|