Abstract

The 3D Tune-In Toolkit (3DTI Toolkit) is an open-source standard C++ library which includes a binaural spatialiser. This paper presents the technical details of this renderer, outlining its architecture and describing the processes implemented in each of its components. In order to put this description into context, the basic concepts behind binaural spatialisation are reviewed through a chronology of research milestones in the field over the last 40 years. The 3DTI Toolkit renders the anechoic signal path by convolving sound sources with Head Related Impulse Responses (HRIRs), obtained by interpolating those extracted from a set that can be loaded from any file in a standard audio format. Interaural time differences are managed separately, in order to customise the rendering according to the head size of the listener and to reduce comb filtering when interpolating between different HRIRs. In addition, geometrical and frequency-dependent corrections for simulating near-field sources are included. Reverberation is computed separately using a virtual-loudspeaker Ambisonic approach and convolution with Binaural Room Impulse Responses (BRIRs). In all these processes, special care has been taken to avoid audible artefacts produced by changes in gains and audio filters due to the movements of sources and of the listener. The performance of the 3DTI Toolkit, as well as some other relevant metrics such as non-linear distortion, is assessed and presented, followed by a comparison between the features offered by the 3DTI Toolkit and those found in other currently available open- and closed-source binaural renderers.

Highlights

  • Binaural literally means relating or involving both ears

  • Using the HRTF with its initial delay (ITD) produced noticeable colouration due to the comb-filtering effect, causing additional notches to appear between 2 kHz and 6 kHz and from 10 kHz to 17 kHz

  • Head Related Impulse Responses (HRIRs) generated by interpolating without the ITD are more similar to the original HRIR from the database
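The comb-filtering effect mentioned in the highlights above can be illustrated with a minimal C++ sketch. It is not the 3DTI Toolkit's code: the `AlignedHrir` struct, the function names and the threshold-based onset detector are all assumptions made for illustration. The idea is to strip each HRIR's initial delay, interpolate the delay-free responses, and interpolate the delays as a separate number.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical container (not a 3DTI Toolkit type): one ear's HRIR with its
// initial delay stripped and stored separately.
struct AlignedHrir {
    std::vector<float> samples;  // impulse response, onset moved to index 0
    float delaySamples;          // removed initial delay, in samples
};

// Assumed onset detector: the first sample whose magnitude reaches a given
// fraction of the peak magnitude. Other detectors could be used instead.
AlignedHrir removeOnsetDelay(const std::vector<float>& hrir, float threshold = 0.1f) {
    assert(threshold > 0.0f && threshold <= 1.0f);
    float peak = 0.0f;
    for (float s : hrir) peak = std::max(peak, std::fabs(s));

    std::size_t onset = 0;
    while (onset < hrir.size() && std::fabs(hrir[onset]) < threshold * peak) ++onset;

    // Shift left by the onset and zero-pad so the length is unchanged.
    std::vector<float> aligned(hrir.size(), 0.0f);
    std::copy(hrir.begin() + static_cast<std::ptrdiff_t>(onset), hrir.end(), aligned.begin());
    return AlignedHrir{std::move(aligned), static_cast<float>(onset)};
}

// Linear interpolation of two delay-free HRIRs; w = 0 gives a, w = 1 gives b.
// The delays are interpolated as scalars instead of being mixed as waveforms.
AlignedHrir interpolateAligned(const AlignedHrir& a, const AlignedHrir& b, float w) {
    assert(a.samples.size() == b.samples.size());
    AlignedHrir out{std::vector<float>(a.samples.size()), 0.0f};
    for (std::size_t i = 0; i < out.samples.size(); ++i) {
        out.samples[i] = (1.0f - w) * a.samples[i] + w * b.samples[i];
    }
    out.delaySamples = (1.0f - w) * a.delaySamples + w * b.delaySamples;
    return out;
}
```

As a usage example, take two unit impulses located at samples 2 and 6. Averaging the raw responses would give two half-amplitude peaks four samples apart, which is a comb filter. With the sketch above, `interpolateAligned` at `w = 0.5` returns a single unit peak at index 0 and a delay of 4 samples, which a renderer could then apply as a separate (possibly fractional) delay.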


Summary

Introduction

Binaural literally means relating or involving both ears. Binaural hearing refers to the ability of the auditory system to analyse the sound at the two ears, integrate the information embedded in the acoustic stimuli, and perceive sound as coming from a three-dimensional space.

HRTFs measured at a given distance are modified in order to simulate sound sources located at closer or farther positions. Another available approach, different from the one implemented in the 3DTI Toolkit, is to use databases where HRTFs have been measured at different distances from the listener [67], adding a further dimension to the HRIR interpolation process. We refer to this process as an HRIR correction because it is applied in series with the HRIR selected and interpolated in the previous stages. In both cases (HRTF convolution and high-performance mode), a problem arises when the source or the listener is moving, as the near-field correction filters have to change from frame to frame.

This allows the test application to be used as an audio renderer, fully and remotely controlled by other applications, such as VR visual renderers, motion tracking systems, etc.
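The problem noted above, with near-field correction filters changing from frame to frame, is commonly handled by crossfading. The following C++ sketch shows that general technique under stated assumptions: the frame has already been processed twice, once with the previous filter coefficients and once with the new ones, and the two outputs are blended with a linear ramp. The function name `crossfadeFrames` is hypothetical, and this is not claimed to be the method used in the 3DTI Toolkit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Blend two versions of the same audio frame: oldOut was filtered with the
// previous frame's coefficients, newOut with the current ones (both assumed
// to be computed by the caller). The gain ramps linearly across the frame,
// so the output starts on the old filter and ends on the new one.
std::vector<float> crossfadeFrames(const std::vector<float>& oldOut,
                                   const std::vector<float>& newOut) {
    assert(oldOut.size() == newOut.size());
    const std::size_t n = oldOut.size();
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        // g goes from 0 at the first sample to 1 at the last sample.
        const float g = (n > 1) ? static_cast<float>(i) / static_cast<float>(n - 1) : 1.0f;
        out[i] = (1.0f - g) * oldOut[i] + g * newOut[i];
    }
    return out;
}
```

The design trade-off is cost against smoothness. The frame is filtered twice whenever the coefficients change, but the output has no step discontinuity at the frame boundary, which is the kind of artefact that abrupt coefficient switching tends to produce.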

Evaluation
Evaluation of the HRIR interpolation technique
Discussion and comparison with existing tools
Conclusions