Abstract
For spatial audio reproduction in the context of augmented reality, a position-dynamic binaural synthesis system can be used to synthesize the ear signals for a moving listener. The goal is the fusion of the auditory perception of the virtual audio objects with the real listening environment. Such a system has several components, each of which helps to enable a plausible auditory simulation. For each possible position of the listener in the room, a set of binaural room impulse responses (BRIRs) congruent with the expected auditory environment is required to avoid room divergence effects. Adequate and efficient approaches are methods that synthesize new BRIRs from very few measurements of the listening room. The required spatial resolution of the BRIR positions can be estimated from spatial auditory perception thresholds. Retrieving and processing the tracking data of the listener’s head pose and position, as well as convolving BRIRs with an audio signal, needs to be done in real time. This contribution presents work done by the authors and describes several technical components of such a system in detail. It shows how the individual components are affected by psychoacoustics. Furthermore, the paper discusses the perceptual effects by means of listening tests that demonstrate the appropriateness of the approaches.
Highlights
Immersive reproduction of spatial audio in a way that both artificial and real audio objects are perceived as plausible audible events in a virtual and/or augmented environment is something researchers have tried for many years
We describe a basic scheme of an auditory augmented reality (AAR) system built from several different functional blocks which realize a position-dynamic binaural synthesis
Methods for binaural room impulse response (BRIR) synthesis are presented that create new listening positions in a room based on very few measurements
Summary
Immersive reproduction of spatial audio in a way that both artificial and real audio objects are perceived as plausible audible events in a virtual and/or augmented environment is something researchers have tried for many years. The room impulse response is decomposed into modifiable parameters which can be changed depending on the listener movement. This allows an efficient adaptation of filter coefficients, but the success relies on the quality of the model. Filter shaping approaches rely on a BRIR measurement at one position, and only certain properties of the filters are adapted when the listener moves, such as the energy decay curve (EDC), the level of the direct sound, or the initial time delay gap (ITDG) [15,16,17]. Often these changes are empirically determined, but they can be estimated by simple models (such as the inverse square law). The filters have to be changed each time the position or the pose of the listener changes. As has become clear, the plausible creation of virtual audio objects fused into a real room is the main challenge for AAR.
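The inverse square law mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single-channel impulse response stored as a list of samples, and a hypothetical index `direct_end_idx` marking where the direct sound ends; how that boundary is found is not specified in the text. Since intensity falls with 1/r², the pressure amplitude of the direct sound scales with 1/r, so only the direct-sound segment is rescaled and the remainder of the response is left unchanged.

```python
import math


def direct_sound_gain(ref_distance_m, new_distance_m):
    """Linear amplitude gain of the direct sound when the listener moves
    from ref_distance_m to new_distance_m (amplitude ~ 1/r)."""
    if ref_distance_m <= 0 or new_distance_m <= 0:
        raise ValueError("distances must be positive")
    return ref_distance_m / new_distance_m


def gain_to_db(gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)


def adapt_direct_sound(brir, direct_end_idx, ref_distance_m, new_distance_m):
    """Scale only the direct-sound part brir[:direct_end_idx].

    direct_end_idx is a hypothetical parameter (an assumption of this
    sketch); samples from that index onward are returned unchanged.
    """
    g = direct_sound_gain(ref_distance_m, new_distance_m)
    return [s * g if i < direct_end_idx else s for i, s in enumerate(brir)]


if __name__ == "__main__":
    # Toy response measured at 2 m, listener moves to 4 m from the source.
    measured = [1.0, 0.5, 0.2, 0.1]
    print(adapt_direct_sound(measured, 2, 2.0, 4.0))
    print(round(gain_to_db(direct_sound_gain(2.0, 4.0)), 2), "dB")
```

In this toy case, doubling the distance halves the direct-sound amplitude, which corresponds to a level change of about -6 dB. A fuller filter shaping scheme would also adapt the ITDG and the EDC, which this sketch leaves out.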