A sinusoidal voice over packet coder tailored for the frame-erasure channel

J Lindblom

doi:10.1109/tsa.2005.851913

Abstract

A speech coder tailored especially for the frame-erasure channel-the sinusoidal voice over packet coder (SVOPC)-is proposed. Based on a classified approach, avoiding interframe coding techniques, and synthesizing its output from slowly varying parameters, the coder is inherently robust to packet loss. SVOPC is based on quasi-harmonic modeling of the linear prediction (LP) residual. Both the sinusoidal amplitudes and phases are explicitly encoded using new methods based on Gaussian mixture models. A wide-band (16-kHz sampling frequency) implementation of the coder provides synthesized speech of good subjective quality at around 20 kbps. SVOPC is evaluated by means of subjective listening tests, and compared to a reference system based on G.722.2 (the AMR wide-band codec). Under frame erasure conditions (5%-30% frame erasures generated according to a Gilbert model), SVOPC clearly outperforms G.722.2.

Full Text