Abstract

Immersive audio has received significant attention in the past decade. The emergence of a few groundbreaking systems and events (Dolby Atmos, MPEG-H, VR/AR, AI) is reshaping the landscape of this field and accelerating the mass-market adoption of immersive audio. This review serves as a quick recap of immersive audio background and the end-to-end workflow, covering audio capture, compression, and rendering. The technical aspects of object audio and ambisonics are explored, as well as related topics such as binauralization, virtual surround, and upmix. Industry trends and applications are also discussed, since user experience ultimately decides the future direction of immersive audio technologies.

Highlights

  • The past decade has witnessed a surge in immersive audio systems, ranging from professional systems in cinemas to consumer-grade systems for domestic, automotive, VR/AR, and mobile platforms

  • The results show that a 50 cm spacing produced the most accurate localization (Fig. 15)

  • It is reported that more than 2000 songs are available in immersive audio format from major labels such as Sony Music, Universal Music, and Warner Music, as well as live concerts offered by Live Nation

Summary

INTRODUCTION

The past decade has witnessed a surge in immersive audio systems, ranging from professional systems in cinemas to consumer-grade systems for domestic, automotive, VR/AR, and mobile platforms. When proper audio elements are presented, the addition of height information generates a strong sense of immersion that is not readily experienced in legacy channel-based systems. This change is difficult to achieve within the traditional 5.1 format. Most notably, by describing audio objects with metadata, mixing engineers are no longer limited by a fixed 5.1 loudspeaker arrangement when describing a sound scene. This separation of content creation from the actual playback venue makes the system agnostic to a vast range of heterogeneous rendering devices. W, the zero-order signal, represents an omnidirectional component, whereas X, Y, and Z are the figure-of-eight directional components. This WXYZ representation is often called B-Format, conveniently linking object audio to ambisonics. The sound pressure field of a surround audio signal around the origin at position (r, θ, φ) can be described by the spherical harmonic functions of physics, as given by Equations (4) and (5).
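
To make the W, X, Y, Z description above concrete, the following is a minimal sketch of panning a single mono audio object into first-order B-Format. It is not taken from the paper: the function name `encode_first_order`, the azimuth convention (counterclockwise from the front, so +Y points left), and the traditional 1/√2 weighting on W are all assumptions chosen for illustration. Other normalizations (for example SN3D) scale the channels differently.

```python
import math


def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order B-Format (W, X, Y, Z).

    Assumptions (not from the source text): azimuth is measured
    counterclockwise from straight ahead, elevation upward from the
    horizontal plane, and W carries the traditional 1/sqrt(2) weighting.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional (zero-order)
    x = sample * math.cos(az) * math.cos(el)  # front-back figure-of-eight
    y = sample * math.sin(az) * math.cos(el)  # left-right figure-of-eight
    z = sample * math.sin(el)                 # up-down figure-of-eight
    return w, x, y, z


# The same unit-amplitude object placed at three positions.
front = encode_first_order(1.0, 0.0, 0.0)
left = encode_first_order(1.0, 90.0, 0.0)
above = encode_first_order(1.0, 0.0, 90.0)

for name, bformat in (("front", front), ("left", left), ("above", above)):
    print(name, tuple(round(c, 3) for c in bformat))
```

In this sketch the object's position metadata only affects the X, Y, and Z gains, while W stays constant. This is one way to see how an object described by metadata can be mapped into an ambisonic sound field independently of the playback loudspeaker layout.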

SOUND CAPTURE
STORAGE AND TRANSMISSION OF IMMERSIVE AUDIO
RENDERING
INDUSTRY TREND
CONCLUSION