Abstract

For spatial audio reproduction in the context of augmented reality, a position-dynamic binaural synthesis system can be used to synthesize the ear signals for a moving listener. The goal is the fusion of the auditory perception of the virtual audio objects with the real listening environment. Such a system has several components, each of which helps to enable a plausible auditory simulation. For each possible position of the listener in the room, a set of binaural room impulse responses (BRIRs) congruent with the expected auditory environment is required to avoid room divergence effects. Adequate and efficient approaches synthesize new BRIRs from very few measurements of the listening room. The required spatial resolution of the BRIR positions can be estimated from spatial auditory perception thresholds. Retrieving and processing the tracking data of the listener’s head pose and position, as well as convolving BRIRs with an audio signal, need to be done in real time. This contribution presents work done by the authors and describes several technical components of such a system in detail. It shows how the individual components are affected by psychoacoustics. Furthermore, the paper discusses the perceptual effect by means of listening tests that demonstrate the appropriateness of the approaches.

Highlights

  • Immersive reproduction of spatial audio, in which both artificial and real audio objects are perceived as plausible auditory events in a virtual and/or augmented environment, has been a research goal for many years

  • We describe a basic scheme of an auditory augmented reality (AAR) system built from several different functional blocks which realize a position-dynamic binaural synthesis

  • Methods for binaural room impulse response (BRIR) synthesis are presented that create new listening positions in a room based on very few measurements

Summary

Introduction

Immersive reproduction of spatial audio, in which both artificial and real audio objects are perceived as plausible auditory events in a virtual and/or augmented environment, has been a research goal for many years. The room impulse response is decomposed into modifiable parameters that can be changed depending on the listener's movement. This allows an efficient adaptation of filter coefficients, but the success relies on the quality of the model. Filter shaping approaches rely on a BRIR measurement at one position, and only certain properties of the filters are adapted when the listener moves, such as the energy decay curve (EDC), the level of the direct sound, or the initial time delay gap (ITDG) [15,16,17]. Often these changes are determined empirically, but they can also be estimated by simple models (such as the inverse square law). The filters have to be changed each time the position or the pose of the listener changes. As has become clear, the plausible creation of virtual audio objects fused into a real room is the main challenge for AAR.
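The filter shaping idea can be illustrated with a minimal sketch that adapts only the level of the direct sound of a single measured BRIR using the inverse square law. This is not the authors' implementation. The function names (`direct_sound_gain`, `shape_brir`), the 2.5 ms direct-sound window, and the use of the strongest sample as the direct-sound onset are assumptions made for illustration. Adapting the EDC or the ITDG would need further modelling that is not shown here.

```python
import numpy as np


def direct_sound_gain(d_ref, d_new):
    """Amplitude gain of the direct sound when the source distance
    changes from d_ref to d_new (both in metres).

    Inverse square law: intensity falls with 1/d^2, so the sound
    pressure amplitude falls with 1/d.
    """
    if d_ref <= 0 or d_new <= 0:
        raise ValueError("distances must be positive")
    return d_ref / d_new


def shape_brir(brir, fs, d_ref, d_new, direct_ms=2.5):
    """Return a copy of `brir` with its direct-sound part rescaled.

    brir      : array of shape (n_samples, 2), left and right ear
    fs        : sampling rate in Hz
    direct_ms : assumed length of the direct-sound window after onset
    """
    brir = np.asarray(brir, dtype=float)
    # Assumption: the strongest sample across both ears marks the direct sound.
    onset = int(np.argmax(np.max(np.abs(brir), axis=1)))
    end = min(len(brir), onset + int(round(direct_ms * 1e-3 * fs)))
    shaped = brir.copy()
    # Hard window edge for brevity; a short fade would avoid a discontinuity.
    shaped[:end] *= direct_sound_gain(d_ref, d_new)
    return shaped


if __name__ == "__main__":
    fs = 8000
    brir = np.zeros((100, 2))
    brir[10] = 1.0   # toy direct sound
    brir[50] = 0.5   # toy reflection, left unchanged
    shaped = shape_brir(brir, fs, d_ref=2.0, d_new=4.0)
    print(shaped[10], shaped[50])
```

Doubling the distance halves the direct-sound amplitude (about 6 dB less), while the later part of the response is left as measured. In a running system such a reshaped filter would have to be recomputed whenever the tracked position changes.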

Proposal of a Position-Dynamic Binaural Synthesis System
Spatial Sub-Sampling
BRIR Synthesis
Constant Reverberation
Acoustical Shaping
Sound Source Directivity
Real-Time Processing
Discussion
