Abstract

In multichannel spatial audio coding (SAC), accurate representation of virtual sounds and efficient compression of spatial parameters are the keys to perfect reproduction of spatial sound effects in 3D space. The just noticeable difference (JND) characteristics of the human auditory system can be used to efficiently remove spatial perceptual redundancy in the quantization of spatial parameters. However, the quantization step sizes of spatial parameters in current SAC methods are not well correlated with the JND characteristics, which results in either spatial perceptual distortion or inefficient compression. This paper proposes a JND-based spatial parameter quantization (JSPQ) method, in which the quantization step sizes of spatial parameters are assigned according to the JND values of azimuths over a full circle. The quantization codebook size of JSPQ was 56.7 % lower than that of one of the quantization codebooks of MPEG Surround. The average bit rate reduction on spatial parameters for standard 5.1-channel signals reached up to approximately 13 % compared with MPEG Surround, while preserving comparable subjective spatial quality.

Highlights

  • Along with the trend towards high-quality audio, audio systems have evolved through mono, stereo, to multichannel audio systems

  • In spatial audio coding, the spatial information of virtual sounds generated by loudspeakers is extracted as spatial parameters, and downmix signals are obtained from the loudspeaker signals

  • In the case of multichannel loudspeaker systems, the virtual sounds may be widely distributed in 3D space

Summary

Introduction

Along with the trend towards high-quality audio, audio systems have evolved from mono through stereo to multichannel systems. The spatial perceptual features of binaural cues (such as the JND of the ILD) can be used as a reference in the quantization of spatial parameters (such as the ICLD) to remove perceptual redundancy. These spatial hearing characteristics of the human auditory system were used in the first SAC framework, Binaural Cue Coding (BCC), to reconstruct the spatial effect in stereo signals with a bit rate of only 2 kbps for spatial parameters [13, 14]. The main contributions of this work are as follows: to accurately represent the virtual sound, a method was proposed to estimate the azimuth spatial parameter and the virtual sound signal from an arbitrary number of loudspeakers; an azimuthal JND-based spatial parameter quantization method (JSPQ) was proposed; and the generation procedure of the azimuth quantization codebook was described in detail. Objective experiments and subjective evaluations confirmed that the proposed JSPQ outperformed reference quantization methods for spatial parameters in terms of codebook size, quantization error, coding bit rate, and spatial quality.
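
The idea of assigning azimuth quantization steps according to JND values can be illustrated with a rough sketch. It is not the codebook generation procedure of the paper, which this summary does not reproduce. The JND model `toy_jnd`, the step rule (twice the JND at the current value), and the restriction to 0° to 180° (assuming left/right symmetry) are all assumptions made for illustration only.

```python
import bisect
import math


def toy_jnd(azimuth_deg: float) -> float:
    """Hypothetical JND model in degrees (NOT measured data).

    Assumes the JND is smallest in front (0 deg) and largest
    at the side (90 deg).
    """
    return 1.0 + 9.0 * abs(math.sin(math.radians(azimuth_deg)))


def build_codebook(jnd, start=0.0, stop=180.0):
    """Place quantization values with a JND-dependent step size.

    Assumed rule: the next value lies 2 * jnd(current) away, so the
    midpoint between neighbours is one JND from the current value.
    """
    values = [start]
    while True:
        current = values[-1]
        nxt = current + 2.0 * jnd(current)
        if nxt >= stop:
            break
        values.append(nxt)
    values.append(stop)  # always include the end point
    return values


def quantize(azimuth_deg: float, codebook) -> float:
    """Return the codebook value nearest to the given azimuth."""
    i = bisect.bisect_left(codebook, azimuth_deg)
    candidates = codebook[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda v: abs(v - azimuth_deg))


if __name__ == "__main__":
    codebook = build_codebook(toy_jnd)
    print(len(codebook), "azimuth values between 0 and 180 degrees")
    print("quantize(47.3) ->", quantize(47.3, codebook))
```

With such a rule, the codebook is dense where the assumed JND is small and sparse where it is large, which is the intuition behind using fewer codebook entries than a uniform quantizer with the same finest step.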

Spatial audio coding
Proposed quantization methods of spatial parameters
Quantization of spatial parameter ICLD
Methods
Findings
Conclusions