Abstract

With the proliferation of high-quality virtual reality systems, the demand for high-fidelity spatial audio reproduction has grown. This requires individual head-related transfer functions (HRTFs) with high spatial resolution. Acquiring such HRTFs is not always possible, which motivates the use of sparsely sampled HRTFs. Additionally, real-time applications require a compact representation of HRTFs. Recently, spherical harmonics (SH) have been suggested for efficient interpolation and representation of HRTFs. However, representing sparse HRTFs with a limited SH order may introduce spatial aliasing and truncation errors, which have a detrimental effect on the reproduced spatial audio, because the HRTF is inherently of a high spatial order. One approach to overcome this limitation is to pre-process the HRTF, with the aim of reducing its effective SH order. A recent study showed that order reduction can be achieved by time-alignment of HRTFs, through numerical estimation of the time delays of the HRTFs. In this paper, a new method for pre-processing HRTFs in order to reduce their effective order is presented. The method uses phase-correction based on ear alignment, exploiting the dual-centering nature of HRTF measurements. In contrast to time-alignment, the phase-correction is performed parametrically, making it more robust to measurement noise. The SH order reduction and the ensuing interpolation errors due to sparse sampling were analyzed for these two methods. Results indicate a significant reduction in the effective SH order: only 100 measurements and order 6 are required to achieve a normalized mean square error below $-10$ dB compared to a fully sampled, high-order HRTF.

Highlights

  • Spatial audio interpolation plays an increasingly important role in applications such as virtual and augmented reality, spatial music, multimedia and gaming [1], [2]

  • Due to the behavior of the spherical Bessel function, which has a negligible magnitude for $kr \gg n$ [35], the free-field head-related transfer functions (HRTFs) can be considered to be order limited by $N = kr_a$ (Ward & Abhayapala [36] showed that $N = kr$ gives an interpolation error of around $-14$ dB)

  • To counteract the effect described in the previous section, and to possibly reduce the effective SH order of the HRTF, a phase-correction method based on ear alignment is suggested
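
The order-limit rule in the highlights can be illustrated with a short calculation. This is a minimal sketch, not the paper's procedure: it reads the rule as $N = \lceil k r_a \rceil$ with wavenumber $k = 2\pi f / c$, and the head radius (0.0875 m), the speed of sound (343 m/s), and the function name `sh_order_limit` are all assumptions introduced here for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature
HEAD_RADIUS = 0.0875    # m, assumed nominal head radius (not from the source)


def sh_order_limit(freq_hz, radius_m=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Approximate SH order limit N = ceil(k * r) at a given frequency.

    Hypothetical helper: k is the wavenumber 2*pi*f/c and r is the
    radius of the sphere enclosing the head.
    """
    k = 2.0 * math.pi * freq_hz / c
    return math.ceil(k * radius_m)


if __name__ == "__main__":
    # The order limit grows roughly linearly with frequency.
    for f in (1000, 4000, 8000):
        print(f"{f} Hz -> N = {sh_order_limit(f)}")
```

The sketch only shows the trend that the required order grows with frequency. The exact values depend on the assumed radius and on the error tolerance chosen for the truncation.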


Summary

INTRODUCTION

Spatial audio interpolation plays an increasingly important role in applications such as virtual and augmented reality, spatial music, multimedia and gaming [1], [2]. HRTFs can be considered order-limited functions whose order increases with frequency and has been theoretically shown to exceed N = 40 at a frequency of 20 kHz [15]. This means that at least $(N + 1)^2 = 1681$ measurement directions are needed for accurate SH representation. A recent study by Brinkmann and Weinzierl [26] compared several HRTF pre-processing methods, using a simulated HRTF, in terms of SH energy distribution and binaural models for source localization, coloration and correlation. They showed that complex frequency representation of the HRTF, which seems to be the most common method [12], [15], [19], [27]–[29], requires the highest SH order and leads to the largest errors for a given SH order.
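
The relation between SH order and the number of measurement directions can be checked with a few lines of code. This is a minimal sketch: an order-N expansion has $(N + 1)^2$ coefficients, which is a lower bound on the number of sampling directions, and practical sampling schemes may need more. The function names `min_directions` and `max_order` are hypothetical and introduced only for this illustration.

```python
import math


def min_directions(order):
    """Number of SH coefficients up to a given order, (N + 1)^2.

    This is a lower bound on the number of measurement directions.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def max_order(num_directions):
    """Largest SH order N satisfying (N + 1)^2 <= num_directions."""
    if num_directions < 1:
        raise ValueError("at least one direction is required")
    return math.isqrt(num_directions) - 1


if __name__ == "__main__":
    print(min_directions(40))  # order 40, as quoted for 20 kHz
    print(min_directions(6))   # order 6, as in the abstract's result
    print(max_order(100))      # upper bound on order for 100 directions
```

For the abstract's result, order 6 needs 49 coefficients, so 100 measurements leave a margin over the lower bound.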

SH REPRESENTATION OF HRTFS
THE EFFECT OF DUAL-CENTERING OF HRTF MEASUREMENTS
PHASE-CORRECTION BY EAR ALIGNMENT
SPARSE SAMPLING WITH PHASE CORRECTION
MEASURES FOR EFFECTIVE SH ORDER
SIMULATION STUDY OF RIGID SPHERE AND MANIKIN HRTFS
SH Spectrum Analysis
Interpolation Error Analysis
EXPERIMENTAL INVESTIGATION OF SPARSELY SAMPLED HUMAN HRTFS
Experimental Setup
Performance Analysis With Nominal Parameters
Sensitivity Analysis For Parameter Values
Individualized Selection of Parameter Values
Findings
CONCLUSION