Abstract

Split vector quantization (SVQ) is efficient and performs well for line spectral frequency (LSF) quantization, but it ignores dependencies between components assigned to different subvectors. Switched SVQ (SSVQ) recovers some of this loss by exploiting nonlinear dependencies through a Gaussian mixture model (GMM). Remaining linear dependencies, i.e., correlations between vector components, can be exploited by transform coding. The Karhunen-Loeve transform (KLT) is the standard choice, but its eigendecomposition and full transform matrices make it computationally expensive. Recently, however, a family of transforms has been characterized through the generalized triangular decomposition (GTD) of the source covariance matrix. The prediction-based lower triangular transform (PLT) is the least complex member of this family and is a building block in the implementation of all the others. This paper proposes a minimum noise structure for PLT SVQ. Coding results for 16-dimensional LSF vectors from wideband speech show that GMM PLT SSVQ achieves transparent quantization down to 41 bits/frame, with distortion performance close to that of GMM KLT SSVQ at about three-fourths of the operational complexity. Other members of the GTD family, such as the geometric mean decomposition (GMD) transform and the bidiagonal (BID) transform, fail to capitalize on their advantageous features at the low bit rates per component in the range tested.
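The contrast between the KLT and the PLT can be illustrated on a toy covariance matrix. The sketch below (an illustration of the general idea, not the paper's exact implementation) derives the PLT from a unit-diagonal LDL-style factorization of the covariance, obtained here via a Cholesky factor; both transforms decorrelate the source, but the PLT needs no eigendecomposition and its matrix is lower triangular with a unit diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
R = A @ A.T + 6 * np.eye(6)          # synthetic positive-definite covariance

# KLT: needs a full eigendecomposition; the transform is the eigenvector matrix.
eigvals, U = np.linalg.eigh(R)
klt_out = U.T @ R @ U                # diagonal: variances = eigenvalues

# PLT: unit-diagonal lower triangular transform from R = L D L^T,
# computed here from a Cholesky factor. No eigendecomposition required.
G = np.linalg.cholesky(R)            # R = G G^T, G lower triangular
L = G / np.diag(G)                   # unit-diagonal lower triangular factor
T = np.linalg.inv(L)                 # the PLT: also unit-diagonal lower triangular
plt_out = T @ R @ T.T                # diagonal D of prediction-error variances

# Both outputs are (numerically) diagonal, i.e., the components are decorrelated.
print(np.max(np.abs(plt_out - np.diag(np.diag(plt_out)))))
```

The unit diagonal of the PLT matrix is what makes it interpretable as linear prediction: each component is predicted from the previous ones, and only the prediction error is transmitted.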
