1
Neal MT, Zahorik P. The impact of head-related impulse response delay treatment strategy on psychoacoustic cue reconstruction errors from virtual loudspeaker arrays. The Journal of the Acoustical Society of America 2022; 151:3729. [PMID: 35778188] [DOI: 10.1121/10.0011588] [Received: 10/11/2021] [Accepted: 05/18/2022]
Abstract
Known errors exist in loudspeaker array processing techniques, often degrading source localization and timbre. The goal of the present study was to use virtual loudspeaker arrays to investigate how treatment of the interaural time delay (ITD) cue from each loudspeaker impacts these errors. Virtual loudspeaker arrays rendered over headphones using head-related impulse responses (HRIRs) allow flexible control of array size. Here, three HRIR delay treatment strategies were evaluated using minimum-phase loudspeaker HRIRs: reapplying the original HRIR delays, applying the relative ITD to the contralateral ear, or separately applying the HRIR delays prior to virtual array processing. Seven array sizes were simulated, and panning techniques were used to estimate HRIRs from 3000 directions using higher-order Ambisonics, vector-base amplitude panning, and the closest loudspeaker technique. Compared to a traditional, physical array, the prior HRIR delay treatment strategy produced similar errors with a 95% reduction in the required array size. When compared to direct spherical harmonic (SH) fitting of head-related transfer functions (HRTFs), the prior delays strategy reduced errors in reconstruction accuracy of timbral and directional psychoacoustic cues. This result suggests that delay optimization can greatly reduce the number of virtual loudspeakers required for accurate rendering of acoustic scenes without SH-based HRTF representation.
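The minimum-phase-plus-delay HRIR decomposition that underlies the delay treatment strategies compared in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the cepstral minimum-phase reconstruction is standard, and the threshold-based onset detector is an assumption for illustration only.

```python
import numpy as np

def minimum_phase(h, n_fft=None):
    """Minimum-phase reconstruction of an impulse response via the real cepstrum.

    Keeps the magnitude spectrum of h and discards its excess (delay) phase.
    """
    n = n_fft or len(h)
    mag = np.abs(np.fft.fft(h, n))
    cep = np.fft.ifft(np.log(mag + 1e-12)).real   # real cepstrum of the magnitude
    w = np.zeros(n)                                # cepstral folding window
    w[0] = 1.0
    w[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        w[n // 2] = 1.0
    return np.fft.ifft(np.exp(np.fft.fft(w * cep))).real[:len(h)]

def split_hrir(h, rel_threshold=0.01):
    """Split an HRIR into (onset delay in samples, minimum-phase part)."""
    onset = int(np.argmax(np.abs(h) >= rel_threshold * np.abs(h).max()))
    return onset, minimum_phase(h)
```

The delay returned by `split_hrir` is what the three strategies treat differently: it can be reapplied per loudspeaker, converted to a relative ITD at the contralateral ear, or applied before virtual array processing.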
Affiliation(s)
- Matthew T Neal
- Department of Otolaryngology and Communicative Disorders, The University of Louisville, 117 East Kentucky Street, Louisville, Kentucky 40203, USA
- Pavel Zahorik
- Department of Otolaryngology and Communicative Disorders, The University of Louisville, 117 East Kentucky Street, Louisville, Kentucky 40203, USA
2
Ifergan I, Rafaely B. On the selection of the number of beamformers in beamforming-based binaural reproduction. EURASIP Journal on Audio, Speech, and Music Processing 2022; 2022:6. [PMID: 35371191] [PMCID: PMC8965231] [DOI: 10.1186/s13636-022-00238-7] [Received: 06/14/2021] [Accepted: 02/14/2022]
Abstract
In recent years, spatial audio reproduction has been widely researched with many studies focusing on headphone-based spatial reproduction. A popular format for spatial audio is higher order Ambisonics (HOA), where a spherical microphone array is typically used to obtain the HOA signals. When a spherical array is not available, beamforming-based binaural reproduction (BFBR) can be used, where signals are captured with arrays of a general configuration. While shown to be useful, no comprehensive studies of BFBR have been presented and so its limitations and other design aspects are not well understood. This paper takes an initial step towards developing a theory for BFBR and develops guidelines for selecting the number of beamformers. In particular, the average directivity factor of the microphone array is proposed as a measure for supporting this selection. The effect of head-related transfer function (HRTF) order truncation that occurs when using too many beamformer directions is presented and studied. In addition, the relation between HOA-based binaural reproduction and BFBR is discussed through analysis based on a spherical array. A simulation study is then presented, based on both a spherical and a planar array, demonstrating the proposed guidelines. A listening test verifies the perceptual attributes of the methods presented in this study. These results can be used for more informed beamformer design for BFBR.
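The average directivity factor proposed here as a measure for selecting the number of beamformers can be sketched for a discrete direction grid. The array geometry, delay-and-sum weights, and grid below are hypothetical illustrations, not the paper's setup.

```python
import numpy as np

def directivity_factor(w, steering, look_idx, dir_weights=None):
    """Beam power toward the look direction divided by the power averaged
    over all candidate directions (a discrete directivity-factor estimate)."""
    resp = np.abs(steering @ np.conj(w)) ** 2       # beam power per direction
    return resp[look_idx] / np.average(resp, weights=dir_weights)

# Hypothetical example: 4-element line array, half-wavelength spacing.
angles = np.linspace(0.0, np.pi, 181)               # candidate arrival angles
m = np.arange(4)
steering = np.exp(-1j * np.pi * np.outer(np.cos(angles), m))
look = 90                                           # broadside (theta = pi/2)
w = steering[look] / 4.0                            # delay-and-sum weights
df = directivity_factor(w, steering, look, dir_weights=np.sin(angles))
```

With solid-angle weighting (`sin(angles)`), the half-wavelength delay-and-sum array recovers the classical result that the directivity factor approaches the number of microphones.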
Affiliation(s)
- Itay Ifergan
- School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Be’er Sheva, Israel
- Boaz Rafaely
- School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Be’er Sheva, Israel
3
Urban Sound Auralization and Visualization Framework—Case Study at IHTApark. Sustainability 2022. [DOI: 10.3390/su14042026]
Abstract
In the context of acoustic urban planning, the use of noise maps is a well-established practice worldwide. In this approach, the noise levels in an urban environment are calculated from models of the sound sources, models of the physical sound propagation effects, and the positions of the receivers in the area of interest. However, the noise mapping method is limited to sound levels in frequency bands, because the temporal and spectral information of the sound signals is missing. This, in turn, means that qualitative sound properties, such as those evaluated by psychoacoustic parameters, cannot be assessed. Beyond the scope of classical noise mapping, auralization and physically based simulation of sound fields can be applied to urban scenarios in the context of urban soundscape analysis. By supporting the auralization technology with a visual counterpart of the urban space, a plausible virtual representation of a real environment can be achieved. The presented framework combines the possibilities of the open-source auralization tool Virtual Acoustics with 3D visualization. To enable studies with natural human responses, or for public communication of urban design projects, these virtual scenes can be reproduced either with immersive technologies such as head-mounted displays (HMDs), or using online video platforms and traditional playback devices. The paper presents an overview of which physical principles can already be simulated, which technological considerations need to be taken into account, and how to set up such an environment for the auralization and visualization of urban scenes. We demonstrate the framework with a case study of IHTApark.
4
Ahrens J, Andersson C. Perceptual evaluation of headphone auralization of rooms captured with spherical microphone arrays with respect to spaciousness and timbre. The Journal of the Acoustical Society of America 2019; 145:2783. [PMID: 31046319] [DOI: 10.1121/1.5096164] [Received: 08/12/2018] [Accepted: 01/08/2019]
Abstract
A listening experiment is presented in which subjects rated the perceived differences, in terms of spaciousness and timbre, between a headphone-based, head-tracked dummy-head auralization of a sound source in different rooms and a headphone-based, head-tracked auralization of a spherical microphone array recording of the same scenario. The underlying auralizations were based on measured impulse responses to ensure equal conditions. Rigid-sphere arrays with different numbers of microphones, ranging from 50 up to 1202, were emulated through sequential measurements, and spherical harmonic orders of up to 12 were tested. The results show that the array auralizations are partially indistinguishable from the direct dummy-head auralization at a spherical harmonic order of 8 or higher if the virtual sound source is located at a lateral position. No significant reduction of the perceived differences with increasing order is observed for frontal virtual sound sources; in this case, small differences with respect to both spaciousness and timbre persist. The evaluation of lowpass-filtered stimuli shows that the perceived differences occur exclusively at higher frequencies and can therefore be attributed to spatial aliasing. The room had only a minor effect on the results.
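The attribution of the remaining differences to spatial aliasing follows the common rule of thumb that a spherical array supports spherical-harmonic order N up to roughly kr = N. A small sketch of that rule (the sphere radius below is a hypothetical, roughly head-sized value, not taken from the paper):

```python
import numpy as np

def sh_aliasing_frequency(order, radius, c=343.0):
    """Rule-of-thumb upper frequency for spherical-harmonic order N on a
    sphere of the given radius in meters: kr = N, i.e. f = N * c / (2*pi*r)."""
    return order * c / (2.0 * np.pi * radius)
```

For order 8 and a hypothetical 8.75 cm sphere this gives roughly 5 kHz, consistent with the reported finding that perceived differences are confined to the high frequencies.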
Affiliation(s)
- Jens Ahrens
- Audio Technology Group, Division of Applied Acoustics, Chalmers University of Technology, 412 96 Gothenburg, Sweden
- Carl Andersson
- Audio Technology Group, Division of Applied Acoustics, Chalmers University of Technology, 412 96 Gothenburg, Sweden
5
Interaural Level Difference Optimization of Binaural Ambisonic Rendering. Applied Sciences (Basel) 2019. [DOI: 10.3390/app9061226]
Abstract
Ambisonics is a spatial audio technique appropriate for dynamic binaural rendering due to its sound field rotation and transformation capabilities, which has made it popular for virtual reality applications. An issue with low-order Ambisonics is that interaural level differences (ILDs) are often reproduced with lower values when compared to head-related impulse responses (HRIRs), which reduces lateralization and spaciousness. This paper introduces a method of Ambisonic ILD Optimization (AIO), a pre-processing technique to bring the ILDs produced by virtual loudspeaker binaural Ambisonic rendering closer to those of HRIRs. AIO is evaluated objectively for Ambisonic orders up to fifth order versus a reference dataset of HRIRs for all locations on the sphere via estimated ILD and spectral difference, and perceptually through listening tests using both simple and complex scenes. Results conclude AIO produces an overall improvement for all tested orders of Ambisonics, though the benefits are greatest at first and second order.
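The ILD cue that this abstract evaluates can be estimated in its simplest broadband form directly from an HRIR pair. This is only an illustration of the cue, not the paper's AIO method, which operates on estimated ILDs across the sphere and typically in frequency bands.

```python
import numpy as np

def ild_db(hrir_left, hrir_right):
    """Broadband interaural level difference in dB (positive: left ear louder)."""
    return 10.0 * np.log10(np.sum(np.square(hrir_left)) /
                           np.sum(np.square(hrir_right)))
```

In practice, per-band ILDs (e.g., via a gammatone or octave filterbank) are more perceptually meaningful than this single broadband number, but the energy-ratio definition is the same.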
6
Abstract
Ambisonics has enjoyed a recent resurgence in popularity due to virtual reality applications. Low-order Ambisonic reproduction is inherently inaccurate at high frequencies, which causes poor timbre and height localisation. Diffuse-Field Equalisation (DFE), the technique of removing the direction-independent frequency response, is applied to binaural (over headphones) Ambisonic rendering to address high-frequency reproduction. DFE of Ambisonics is evaluated by comparing binaural Ambisonic rendering to direct convolution via head-related impulse responses (HRIRs) in three ways: spectral difference, predicted sagittal-plane localisation, and perceptual listening tests on timbre. Results show that DFE successfully improves the frequency reproduction of binaural Ambisonic rendering for the majority of sound source locations; they also show the limitations of the technique and set the basis for further research in the field.
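The core of DFE, removing the direction-independent part of the frequency response, can be sketched as an energy average over directions followed by a regularized inversion. This simplified sketch assumes a uniform direction grid (a proper implementation would apply quadrature weights per direction) and is not the paper's implementation.

```python
import numpy as np

def diffuse_field_response(hrtfs):
    """Energy-averaged magnitude over directions.
    hrtfs: complex array of shape (n_directions, n_bins)."""
    return np.sqrt(np.mean(np.abs(hrtfs) ** 2, axis=0))

def diffuse_field_equalize(hrtfs, eps=1e-6):
    """Divide every HRTF by the (regularized) diffuse-field magnitude,
    removing the direction-independent part of the frequency response."""
    return hrtfs / (diffuse_field_response(hrtfs) + eps)[None, :]
```

After equalization, the average magnitude across directions is flat, so only direction-dependent spectral features remain.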
7
Wang Y, Chen K. Translations of spherical harmonics expansion coefficients for a sound field using plane wave expansions. The Journal of the Acoustical Society of America 2018; 143:3474. [PMID: 29960480] [DOI: 10.1121/1.5041742]
Abstract
A translation method for the spherical harmonics expansion coefficients of a sound field using plane wave expansions is proposed. It is based on the decomposition of a plane wave in the spherical harmonics domain and does not use the spherical harmonics addition theorem, and is therefore very computationally efficient. Simulations are conducted for validation. The stability of the translations is compared with that of the conventional method in terms of matrix condition numbers. The proposed method is demonstrated to be more robust as the frequency increases and when upscaling the coefficients. In addition, the computation is much faster when high truncation orders are solved.
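For reference, the standard plane-wave expansion in the spherical harmonics domain that underlies such translation schemes is (sign and normalization conventions vary between texts):

```latex
e^{i\mathbf{k}\cdot\mathbf{r}}
  = 4\pi \sum_{n=0}^{\infty} \sum_{m=-n}^{n}
    i^{n}\, j_{n}(kr)\,
    \left[ Y_{n}^{m}(\theta_k,\phi_k) \right]^{*}
    Y_{n}^{m}(\theta_r,\phi_r)
```

Here j_n is the spherical Bessel function of the first kind, Y_n^m are the spherical harmonics, (theta_k, phi_k) is the direction of the wave vector, and (theta_r, phi_r) is the direction of the field point.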
Affiliation(s)
- Yan Wang
- Department of Environmental Engineering, School of Marine Science and Technology, Northwestern Polytechnical University, 127 Youyi West Road, Xi'an, Shaanxi 710072, China
- Kean Chen
- Department of Environmental Engineering, School of Marine Science and Technology, Northwestern Polytechnical University, 127 Youyi West Road, Xi'an, Shaanxi 710072, China
8
Zaunschirm M, Schörkhuber C, Höldrich R. Binaural rendering of Ambisonic signals by head-related impulse response time alignment and a diffuseness constraint. The Journal of the Acoustical Society of America 2018; 143:3616. [PMID: 29960468] [DOI: 10.1121/1.5040489]
Abstract
Binaural rendering of Ambisonic signals is of great interest in the fields of virtual reality, immersive media, and virtual acoustics. Typically, the spatial order of head-related impulse responses (HRIRs) is considerably higher than the order of the Ambisonic signals. The resulting order reduction of the HRIRs has a detrimental effect on the binaurally rendered signals, and perceptual evaluations indicate limited externalization, localization accuracy, and altered timbre. In this contribution, a binaural renderer, which is computed using a frequency-dependent time alignment of HRIRs followed by a minimization of the squared error subject to a diffuse-field covariance matrix constraint, is presented. The frequency-dependent time alignment retains the interaural time difference (at low frequencies) and results in a HRIR set with lower spatial complexity, while the constrained optimization controls the diffuse-field behavior. Technical evaluations in terms of sound coloration, interaural level differences, diffuse-field response, and interaural coherence, as well as findings from formal listening experiments show a significant improvement of the proposed method compared to state-of-the-art methods.
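The frequency-dependent time alignment described in this abstract is embedded in a constrained least-squares renderer; the sketch below only illustrates the alignment idea itself. The 1.5 kHz split frequency, the hard spectral split without a crossfade, and the threshold-based time-of-arrival estimate are all assumptions for illustration, not the authors' parameters.

```python
import numpy as np

def time_align_highs(hrir, fs, f_c=1500.0):
    """Remove the onset (time-of-arrival) delay above f_c while keeping the
    original low-frequency phase, and hence the interaural time difference."""
    n = len(hrir)
    onset = int(np.argmax(np.abs(hrir) >= 0.01 * np.abs(hrir).max()))
    H = np.fft.rfft(hrir)
    f = np.fft.rfftfreq(n, 1.0 / fs)
    advance = np.exp(2j * np.pi * f * onset / fs)   # undo the onset delay
    high = f >= f_c                                 # hard split (no crossfade)
    H[high] *= advance[high]
    return np.fft.irfft(H, n)
```

Aligning the high-frequency portions of the HRIR set in this way lowers its spatial complexity, which is what allows accurate rendering at lower Ambisonic orders.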
Affiliation(s)
- Markus Zaunschirm
- Institute of Electronic Music and Acoustics, University of Music and Performing Arts, Graz, 8010, Austria
- Christian Schörkhuber
- Institute of Electronic Music and Acoustics, University of Music and Performing Arts, Graz, 8010, Austria
- Robert Höldrich
- Institute of Electronic Music and Acoustics, University of Music and Performing Arts, Graz, 8010, Austria
9
Ben-Hur Z, Brinkmann F, Sheaffer J, Weinzierl S, Rafaely B. Spectral equalization in binaural signals represented by order-truncated spherical harmonics. The Journal of the Acoustical Society of America 2017; 141:4087. [PMID: 28618825] [PMCID: PMC5457295] [DOI: 10.1121/1.4983652] [Received: 09/29/2016] [Revised: 04/14/2017] [Accepted: 05/02/2017]
Abstract
The synthesis of binaural signals from spherical microphone array recordings has been recently proposed. The limited spatial resolution of the reproduced signal due to order-limited reproduction has been previously investigated perceptually, showing spatial perception ramifications, such as poor source localization and limited externalization. Furthermore, this spatial order limitation also has a detrimental effect on the frequency content of the signal and its perceived timbre, due to the rapid roll-off at high frequencies. In this paper, the underlying causes of this spectral roll-off are described mathematically and investigated numerically. A digital filter that equalizes the frequency spectrum of a low spatial order signal is introduced and evaluated. A comprehensive listening test was conducted to study the influence of the filter on the perception of the reproduced sound. Results indicate that the suggested filter is beneficial for restoring the timbral composition of order-truncated binaural signals, while conserving, and even improving, some spatial properties of the signal.
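As a generic sketch of realizing such a spectral-equalization filter as a linear-phase FIR: the paper derives its filter from the truncation-induced roll-off itself, whereas the breakpoint frequencies and gains below are purely hypothetical values chosen to mimic a high-frequency boost.

```python
import numpy as np
from scipy.signal import firwin2, freqz

# Hypothetical target: flat at low frequencies, boosting the high-frequency
# region where order truncation attenuates the binaural signal.
fs = 48000
freq = np.array([0.0, 1000.0, 4000.0, 8000.0, 16000.0, fs / 2.0])
gain_db = np.array([0.0, 0.0, 2.0, 6.0, 10.0, 10.0])
taps = firwin2(257, freq, 10.0 ** (gain_db / 20.0), fs=fs)  # linear-phase FIR
w, H = freqz(taps, worN=4096, fs=fs)                        # realized response
```

Because the truncation loss is direction-independent to first order, a single shared filter of this kind can be applied to both ear signals after rendering.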
Affiliation(s)
- Zamir Ben-Hur
- Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
- Fabian Brinkmann
- Audio Communication Group, Technical University of Berlin, Einsteinufer 17c, D-10587 Berlin, Germany
- Jonathan Sheaffer
- Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
- Stefan Weinzierl
- Audio Communication Group, Technical University of Berlin, Einsteinufer 17c, D-10587 Berlin, Germany
- Boaz Rafaely
- Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel