Abstract

Six-Degree-of-Freedom (6DoF) audio rendering interactively synthesizes spatial audio signals for a variable listener perspective based on surround recordings taken at multiple perspectives distributed across the listening area in the acoustic scene. Methods that rely on recording-implicit directional information and interpolate the listener perspective without attempting to localize and extract sounds often yield high audio quality, but are limited in spatial definition. Methods that perform sound localization, extraction, and rendering typically operate in the time-frequency domain and risk introducing artifacts such as musical noise. We propose to take advantage of the rich spatial information recorded in the broadband time-domain signals of the multitude of distributed first-order (B-format) recording perspectives. Broadband time-variant signal extraction, which retrieves direct signals and leaves residuals to approximate diffuse and spacious sounds, poses less of a quality risk, and so does the broadband re-encoding that enhances the spatial definition of both signal types. To detect and track direct sound objects in this process, we combine the directional data recorded at the individual perspectives into a volumetric multi-perspective activity map for particle-filter tracking. Our technical and perceptual evaluation confirms that this kind of processing enhances the otherwise limited spatial definition of direct-sound objects in other broadband but signal-independent virtual loudspeaker object (VLO) or Vector-Based Intensity Panning (VBIP) interpolation approaches.
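The multi-perspective activity map described above can be illustrated with a minimal sketch: directional detections (DOA vectors) from several recording positions are accumulated on a spatial grid, and grid cells that many rays pass close to score high, indicating likely source positions. All function names, parameters, and the Gaussian ray-width model are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def activity_map(mic_positions, doas, grid_x, grid_y, sigma=0.1):
    """Accumulate DOA rays from several array perspectives onto a
    horizontal grid. Cells lying close to many rays score high;
    the peak approximates a direct-sound source position.
    Illustrative sketch only; sigma models the ray width."""
    X, Y = np.meshgrid(grid_x, grid_y)
    acc = np.zeros_like(X)
    for p, d in zip(mic_positions, doas):
        d = np.asarray(d, float) / np.linalg.norm(d)
        # vector from the mic position to each grid cell
        vx, vy = X - p[0], Y - p[1]
        # decompose into components along and perpendicular to the ray
        along = vx * d[0] + vy * d[1]           # projection onto the ray
        perp2 = vx**2 + vy**2 - along**2        # squared distance to the ray
        mask = along > 0                        # only cells in front of the mic
        acc += mask * np.exp(-np.maximum(perp2, 0.0) / (2 * sigma**2))
    return acc
```

For example, two arrays at (0, 0) and (2, 0) observing a source at (1, 1) report DOAs (1, 1) and (-1, 1); the accumulated map peaks where the two rays intersect. Particle-filter tracking, as in the paper, would then follow such peaks over time instead of taking a single argmax per frame.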

Highlights

  • The interactive rendering of recorded auditory scenes as virtual listening environments requires an approach that allows six Degrees of Freedom (6DoF) of movement for a variable listener perspective

  • Virtual loudspeaker object (VLO) rendering shows ratings that appear to depend on the listener position; the significant increase in rating at position 4 occurs because this position coincides with a microphone array position and lies far from any source

  • There, the surround perspective of the microphone position provides accurate reproduction, just as with Vector-Based Intensity Panning (VBIP), while offering a better room impression owed to the rich diversity of VLOs and their directions


Introduction

The interactive rendering of recorded auditory scenes as virtual listening environments requires an approach that allows six Degrees of Freedom (6DoF) of movement for a variable listener perspective. The variable-perspective rendering of auditory scenes requires interpolation between static recording perspective positions. In existing research, this concept is often referred to as scene navigation or scene walk-through. This contribution mainly refers to first-order tetrahedral microphone arrays as a means of recording surround audio for high-fidelity applications. While volumetrically navigable 6DoF recording and rendering are theoretically feasible, practical distributions of multiple static 3D audio recordings typically consider capturing perspective changes along the horizontal dimensions to enable walkable rendering of the auditory scene. Perspective extrapolation of a single perspective for a shifted listening position has been considered in the SpaMoS (spatially modified synthesis) method by Pihlajamäki and Pulkki [1, 2] that estimates time-frequency-domain source positions by projecting directional signal detections of DirAC (directional audio coding [3, 4]) onto a pre-defined
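The projection idea behind SpaMoS can be sketched geometrically: a DOA ray from the recording position is intersected with a pre-defined surface (here a horizontal plane), and the resulting source position is then re-encoded from the shifted listener position. The function names and the plane model are assumptions for illustration, not the method's actual time-frequency processing.

```python
import numpy as np

def project_doa_to_plane(mic_pos, doa, plane_z=0.0):
    """Intersect a DOA ray starting at the recording position with a
    pre-defined horizontal plane z = plane_z, yielding an assumed
    source position (illustrative of the SpaMoS projection idea)."""
    mic_pos = np.asarray(mic_pos, float)
    doa = np.asarray(doa, float)
    t = (plane_z - mic_pos[2]) / doa[2]   # ray parameter at the plane
    return mic_pos + t * doa              # assumed source position

def doa_for_listener(source_pos, listener_pos):
    """Re-encode: unit direction of the projected source as seen
    from a shifted listener position."""
    v = np.asarray(source_pos, float) - np.asarray(listener_pos, float)
    return v / np.linalg.norm(v)
```

With a source position fixed on the plane, shifting the listener merely changes the rendering direction, which is what makes the extrapolated perspective consistent under movement.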
