Low-delay Coder Research Articles

A low-delay coder for speech and music signals sampled at 32 kHz is described. Its algorithmic delay does not exceed 25 ms, which enables audioconferencing applications without echo cancellation. Its bit rate is scalable between 64 and 32 kbit/s in steps of 8 kbit/s. The transmitter issues the binary code at 64 kbit/s with lower bit rate codes embedded in it. The receiver may operate at lower bit rates with gradual loss of quality. The proposed coder is based on a mixed scheme: the adopted solution contains elements from the CELP speech coder and from frequency-domain music coders. The perceptual signal is obtained in the time domain, then transformed to the frequency domain, where the bit allocation is calculated and the transform coefficients are quantized. A first solution based on the DFT is discussed, then a second solution based on an MDCT with small overlap is applied. The coefficients are quantized as follows. First, a prediction of the whole spectrum is applied. Then, a mean-removed gain-shape split VQ is used for amplitude spectrum quantization, and a hierarchical 2-dimensional VQ is used for phase spectrum quantization with amplitude correction. At the phase quantization stage, each codeword describing the selected vector index is split into parts corresponding to different bit rates. Because of the hierarchical codebook structure, truncated indices may be used without greatly affecting the signal quality. Simulation results are presented and the robustness of the proposed coder is examined.
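The embedded-index idea in this abstract can be illustrated with a small sketch. It is not the paper's quantizer: a scalar successive-refinement quantizer of a single phase angle stands in for the hierarchical 2-dimensional VQ, and the function names and bit counts are assumptions chosen for illustration. The point it shows is that when the most significant index bits select the coarse cell, a receiver can drop the least significant bits and still decode a valid, coarser value.

```python
import math

TWO_PI = 2.0 * math.pi

def encode_phase(phase, bits):
    """Quantize a phase to a `bits`-bit index (hypothetical helper).

    The most significant bits pick the coarse cell, so any prefix of the
    index is itself a valid index at a lower resolution.
    """
    levels = 1 << bits
    step = TWO_PI / levels
    return int((phase % TWO_PI) / step) % levels

def decode_phase(index, bits_sent, bits_kept):
    """Decode using only the top `bits_kept` bits of a `bits_sent`-bit index."""
    truncated = index >> (bits_sent - bits_kept)
    step = TWO_PI / (1 << bits_kept)
    return (truncated + 0.5) * step  # centre of the retained cell

def circular_error(a, b):
    """Absolute angular distance between two phases."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)

if __name__ == "__main__":
    phase = 1.0   # illustrative value, in radians
    sent = 6      # bits issued by the transmitter (assumed)
    index = encode_phase(phase, sent)
    for kept in range(sent, 2, -1):
        err = circular_error(phase, decode_phase(index, sent, kept))
        print(f"bits kept: {kept}, phase error: {err:.4f} rad")
```

In this sketch the decoding error stays bounded by half the cell width at whatever resolution is kept, which is one simple way to obtain the gradual loss of quality that the abstract describes for receivers operating below 64 kbit/s.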

We study the application of wavelet packet filterbanks to low bit-rate transparent audio coding, taking the audio coders' delay requirements into account, and propose low-delay coders based on wavelet packet filterbanks. We first develop a method for comparing filterbanks for perceptual audio coding by estimating the bit rate necessary for transparent compression. We use this comparison method to select the best filters for our audio compression scheme from a large set of orthogonal and biorthogonal wavelets. Different wavelet filters may be used at different stages of the tree-structured decomposition, subject to a constraint on the overall delay. The optimization is carried out with a simulated annealing procedure and yields two wavelet packet filterbanks, one with average delay and one with low delay. They are inserted in a complete audio coder that employs vector quantization and psychoacoustic models. The use of the proposed filterbanks leads to the design of a new bit allocation procedure that takes into account the lack of selectivity of the equivalent synthesis filters in a wavelet packet filterbank. The resulting audio scheme is validated through listening tests. The wavelet packet filterbanks are shown to be a promising tool for audio coding, especially for low-delay coding: with average delay, the quality of the wavelet packet filterbanks is as good as that of MPEG-1 Layer-2, both at 80 kb/s, and when the delay is reduced to 200 samples, 96 kb/s are needed to achieve the same quality.
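The delay constraint on the tree-structured decomposition can be made concrete with a rough calculation. The sketch below is not taken from the paper: it assumes every stage is a two-channel orthogonal perfect-reconstruction filterbank whose analysis-plus-synthesis delay is L - 1 samples at its own input rate (biorthogonal stages would need a different per-stage figure), and the function names and filter lengths are hypothetical. Because a stage at depth k runs at a rate reduced by 2^k, its delay counts 2^k times at the input rate.

```python
def path_delay_samples(filter_lengths):
    """Delay in input-rate samples along one root-to-leaf path.

    filter_lengths[k] is the (assumed orthogonal) filter length used at
    depth k, with depth 0 at the root of the tree.
    """
    return sum((length - 1) << depth
               for depth, length in enumerate(filter_lengths))

def tree_delay_samples(paths):
    """Overall delay: shorter paths must be padded to match the slowest one."""
    return max(path_delay_samples(p) for p in paths)

def samples_to_ms(samples, sample_rate_hz):
    """Convert a delay in samples to milliseconds."""
    return 1000.0 * samples / sample_rate_hz

if __name__ == "__main__":
    # Hypothetical tree: short filters near the root, longer ones deeper down.
    paths = [[4, 4, 8, 8], [4, 4, 8], [4, 8]]
    total = tree_delay_samples(paths)
    # 44100 Hz is an assumed sampling rate, not stated in the abstract.
    print(total, "samples,", round(samples_to_ms(total, 44100), 2), "ms")
```

Under these assumptions the deepest stages dominate the total, which suggests why the choice of filter at each depth matters for a delay-constrained design.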

Articles published on Low-delay Coder

Improved audio coding using a psychoacoustic model based on a cochlear filter bank

Low delay coder (< 25 ms) of wideband audio (20 Hz-15 kHz) scalable from 64 to 32 kbit/s

Wavelet packet filterbanks for low time delay audio coding

Low bit-rate CELP speech coder with low delay

Adaptive code excited predictive coding
