Abstract

There is much interest in anthropometric-derived head-related transfer functions (HRTFs) for simulating audio in virtual-reality systems. Three-dimensional (3D) anthropometric measures can be obtained directly from individuals, or estimated indirectly from two-dimensional (2D) pinna images. The latter often requires additional pinna, head and/or torso measures. This study investigated the accuracy with which 3D depth information can be obtained solely from 2D pinna images using an unsupervised monocular-depth estimation neural-network model. The model output was compared with depth information obtained from corresponding magnetic resonance imaging (MRI) head scans, which served as ground truth. Results show that 3D depth estimates obtained from 2D pinna images corresponded closely with MRI head-scan depth values.

Highlights

  • Head-related transfer functions (HRTFs) capture the temporal and spectral scattering of sound waves by the ears, head, and torso of a listener as they travel from an external sound source to the ear canal

  • This study investigated the accuracy with which 3D depth information can be obtained solely from 2D pinna images using an unsupervised monocular-depth estimation neural-network model

  • HRTFs can be estimated with deep neural networks from direct measurements of the individual (Bilinski et al, 2014; He et al, 2015), combined with 2D images of the ear (Lee and Kim, 2018)


Summary

Introduction

Head-related transfer functions (HRTFs) capture the temporal and spectral scattering of sound waves by the ears, head, and torso of a listener as they travel from an external sound source to the ear canal. Generic HRTFs may provide sub-optimal auditory localisation cues for the user, so alternative methods can be used to model or synthesise a close approximation to personalised HRTFs. Such methods include (a) frequency scaling (Middlebrooks and Green, 1992), (b) selection of appropriate HRTFs from a database (Barumerli et al, 2018), and (c) a linear mapping from the principal component analysis (PCA) weights applied to users’ anthropometric parameters to the PCA weights applied to the HRTF spectrum (Lee and Kim, 2018; Meng et al, 2018). A more parsimonious approach would reduce the requirement for direct anthropometric data from the individual, such that a single 2D image would provide sufficient information for selection of the appropriate HRTF.
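
Method (c) above can be illustrated with a minimal sketch. It is not the implementation used in the cited studies: the data are synthetic, the array sizes and the number of retained components are arbitrary, and the helper names (`fit_pca`, `project`, `reconstruct`, `predict_hrtf`) are hypothetical. It only shows the general idea of fitting a least-squares linear map between two sets of PCA weights.

```python
import numpy as np

def fit_pca(X, k):
    """Return the feature mean and the first k principal directions (k x n_features)."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centred data.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def project(X, mean, components):
    """PCA weights of each row of X."""
    return (X - mean) @ components.T

def reconstruct(W, mean, components):
    """Map PCA weights back to the original feature space."""
    return W @ components + mean

# Synthetic stand-in data (assumption): 40 subjects, 10 anthropometric
# parameters, 64 HRTF spectrum bins that depend linearly on anthropometry.
rng = np.random.default_rng(0)
n_subjects, n_anthro, n_freq, k = 40, 10, 64, 5
anthro = rng.normal(size=(n_subjects, n_anthro))
hrtf = anthro @ rng.normal(size=(n_anthro, n_freq))
hrtf += 0.01 * rng.normal(size=hrtf.shape)

# PCA on each domain separately.
a_mean, a_comp = fit_pca(anthro, k)
h_mean, h_comp = fit_pca(hrtf, k)
A = project(anthro, a_mean, a_comp)   # anthropometric PCA weights
H = project(hrtf, h_mean, h_comp)     # HRTF PCA weights

# Least-squares linear map M such that A @ M approximates H.
M, *_ = np.linalg.lstsq(A, H, rcond=None)

def predict_hrtf(new_anthro):
    """Estimate an HRTF spectrum from anthropometric parameters."""
    weights = project(new_anthro, a_mean, a_comp) @ M
    return reconstruct(weights, h_mean, h_comp)

predicted = predict_hrtf(anthro[:1])
```

In practice the anthropometric inputs would come from measurements or, as this study proposes, from depth estimated from a 2D pinna image, and the quality of the map would be assessed on held-out subjects.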
