Abstract
The increasing popularity of Ambisonics as a spatial audio format for streaming services poses new challenges to existing audio coding techniques. Immersive audio delivered to mobile devices requires efficient bitrate compression that does not affect the spatial quality of the content. Good localizability of virtual sound sources is one of the key elements that must be preserved. This study was conducted to investigate the localization precision of virtual sound source presentations within Ambisonic scenes encoded with Opus low-bitrate compression at different bitrates and Ambisonic orders (1st, 3rd, and 5th). The test stimuli were reproduced over a 50-channel spherical loudspeaker configuration and binaurally using individually measured and generic Head-Related Transfer Functions (HRTFs). Participants were asked to adjust the position of a virtual acoustic pointer to match the position of a virtual sound source within the bitrate-compressed Ambisonic scene. Results show that auditory localization in low-bitrate compressed Ambisonic scenes is not significantly affected by codec parameters. The key factors influencing localization are the rendering method and Ambisonic order truncation. This suggests that efficient perceptual coding might be successfully used for mobile spatial audio delivery.
Highlights
Immersive audio technology is an inevitable element of modern digital media
The purpose of the experiment presented in this paper was to subjectively assess the spatial distortion introduced by Ambisonic order truncation and perceptual coding of Ambisonic scenes using different bitrates
The results indicate that localization precision is largely defined by the Ambisonic order, as higher orders provide more precise spatial resolution
Summary
Immersive audio technology is an inevitable element of modern digital media. It is present in cinematic, music and installation arts, broadcast, computer games, virtual reality, and augmented reality applications. Typical use cases for immersive audio in mobile technologies require binaural playback to spatialize the sound. Gear-vr), or both are integrated (e.g. Oculus Quest, https://www.oculus.com/quest), spatial audio is delivered through headphones or miniature speakers built into the headset. The variability of listeners’ perception and listening conditions contributes to the random localization error, which represents the precision of auditory localization.
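To illustrate the distinction between localization precision and accuracy, the following is a minimal sketch, not the analysis used in the study. It assumes that each response is an (azimuth, elevation) pair in degrees, that the constant error is the angle between the mean response direction and the target, and that the random error is the RMS angular deviation of the responses from their own mean direction. The function names and the example values are hypothetical.

```python
import math


def to_unit_vector(azimuth_deg, elevation_deg):
    """Convert a direction in degrees to a Cartesian unit vector."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))


def angle_between_deg(u, v):
    """Great-circle angle in degrees between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))


def localization_errors(target, responses):
    """Return (constant_error, random_error) in degrees.

    Assumed definitions (not taken from the paper):
    constant error = angle between the mean response direction and the target;
    random error = RMS angular deviation of responses from their mean direction.
    """
    target_vec = to_unit_vector(*target)
    vecs = [to_unit_vector(az, el) for az, el in responses]

    # Mean direction: normalized sum of the response unit vectors.
    summed = [sum(v[i] for v in vecs) for i in range(3)]
    norm = math.sqrt(sum(c * c for c in summed))
    if norm == 0.0:
        raise ValueError("responses cancel out; mean direction is undefined")
    mean_dir = tuple(c / norm for c in summed)

    constant_error = angle_between_deg(mean_dir, target_vec)
    random_error = math.sqrt(
        sum(angle_between_deg(v, mean_dir) ** 2 for v in vecs) / len(vecs)
    )
    return constant_error, random_error


# Hypothetical pointer responses for a target at 30 degrees azimuth.
constant_err, random_err = localization_errors((30.0, 0.0),
                                               [(25.0, 0.0), (35.0, 0.0)])
print(f"constant error: {constant_err:.2f} deg, random error: {random_err:.2f} deg")
```

Under these assumptions, responses that scatter symmetrically around the target give a small constant error but a nonzero random error, which is the component the summary associates with precision.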