Variable Rate Speech Coding Research Articles

Phonetic classification of speech frames allows distinctive quantization and bit allocation schemes suited to the particular class. Separate quantization of the linear predictive coding (LPC) parameters for voiced and unvoiced speech frames is shown to offer useful gains for representing the synthesis filter commonly used in code-excited linear prediction (CELP) and other coders. Subjective test results are reported that determine the required bit rate and accuracy in the two classes of voiced and unvoiced LPC spectra for CELP coding with phonetic classification. It was found, in this context, that unvoiced spectra need 9 b/frame or more whereas voiced spectra need 25 b/frame or more with the quantization schemes used. New spectral distortion criteria needed to assure transparent LPC spectral quantization for each voicing class in CELP coders are presented. Similar subjective test results for speech synthesized from the true residual signal are also presented, leading to some interesting observations on the role of the analysis-by-synthesis structure of CELP. Objective performance assessments based on the spectral distortion measure are also presented. The theoretical distortion-rate function for the spectral distortion measure is estimated for voiced and unvoiced LPC parameters and compared with experimental results obtained with unstructured vector quantization (VQ). These results show a saving of at least 2 b/frame for unvoiced spectra compared to voiced spectra to achieve the same spectral distortion performance.
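The spectral distortion measure referred to above is commonly computed as the root-mean-square difference, in dB, between the log power spectra of the unquantized and quantized LPC synthesis filters. The following is a minimal sketch under that assumption, not the authors' exact procedure; the function names, the 256-point frequency grid, and the example coefficients are illustrative choices that do not come from the abstract.

```python
import cmath
import math


def lpc_log_spectrum_db(a, n_points=256):
    """Log power spectrum (dB) of the all-pole filter 1/A(z) on [0, pi).

    `a` is the predictor polynomial [1, a1, ..., ap] of
    A(z) = 1 + a1*z^-1 + ... + ap*z^-p (assumed convention).
    """
    spectrum = []
    for k in range(n_points):
        w = math.pi * k / n_points
        # Evaluate A(e^{jw}) directly from its coefficients.
        A = sum(c * cmath.exp(-1j * w * i) for i, c in enumerate(a))
        # Power of 1/A is 1/|A|^2, i.e. -20*log10|A| in dB.
        spectrum.append(-20.0 * math.log10(abs(A)))
    return spectrum


def spectral_distortion_db(a_ref, a_quant, n_points=256):
    """RMS log-spectral difference (dB) between two LPC filters."""
    ref = lpc_log_spectrum_db(a_ref, n_points)
    qnt = lpc_log_spectrum_db(a_quant, n_points)
    mean_sq = sum((r - q) ** 2 for r, q in zip(ref, qnt)) / n_points
    return math.sqrt(mean_sq)


# Illustrative first-order filters (hypothetical values, not from the paper).
a_reference = [1.0, -0.90]
a_quantized = [1.0, -0.85]
sd = spectral_distortion_db(a_reference, a_quantized)
```

In an evaluation of this kind, the value would be averaged over many frames, separately for the voiced and unvoiced classes, and compared against a per-class transparency threshold.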

The performance of a variable rate code-excited linear predictor system is investigated. The coding system is based on a finite state CELP (FSCELP) framework. Each state is primarily identified by an LPC model order, an LPC coefficient bit allocation, an excitation codebook population density and a state encoding rate. Successive input speech vectors are encoded at a rate that depends on the current state of the FSCELP system and on the characteristics of the input vector. The use of a finite state system involves implicit clustering of speech signals. The lower rate states are selected during highly correlated steady-state speech segments, when relatively few bits are required to obtain adequate fidelity. For speech signals with a strong glottal excitation, unvoiced signals and transient speech segments, relatively greater quantisation accuracy is needed to obtain good fidelity, and therefore the higher rate states of the system are used. Further improvement is obtained by using gamma populated excitation codebooks for those states that are mainly used to encode speech signals with strong underlying glottal excitation pulses. Experiments focus on the varying encoding requirements of the excitation signal for low-pass, voiced, unvoiced and transient speech signals. The parameters of the finite state CELP system are designed to match the encoding requirements of typical speech signals. The greater part of the coding gain is obtained from variable rate encoding of the excitation signal. Using a six-state FSCELP, good quality speech is obtained at average, maximum and minimum bit rates of 4 kbit/s, 10 kbit/s and 2 kbit/s, respectively.
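The relationship between per-state rates and the average rate can be illustrated with a short calculation: the average bit rate is the usage-weighted mean of the state rates. This is a sketch only. The abstract gives just the minimum (2 kbit/s), maximum (10 kbit/s) and average (4 kbit/s) rates of the six-state system; the intermediate state rates and the usage fractions below are hypothetical placeholders, not reported results.

```python
def average_bit_rate(state_rates_kbps, state_usage):
    """Usage-weighted mean bit rate of a finite state coder, in kbit/s.

    `state_usage` holds the fraction (or count) of frames coded in each state.
    """
    if len(state_rates_kbps) != len(state_usage):
        raise ValueError("one usage value is required per state")
    total = sum(state_usage)
    if total <= 0:
        raise ValueError("state usage must sum to a positive value")
    weighted = sum(r * u for r, u in zip(state_rates_kbps, state_usage))
    return weighted / total


# Hypothetical six-state configuration: only the 2 and 10 kbit/s endpoints
# come from the abstract; every other number here is an assumption.
HYPOTHETICAL_RATES_KBPS = [2.0, 3.0, 4.0, 6.0, 8.0, 10.0]
HYPOTHETICAL_USAGE = [0.40, 0.25, 0.15, 0.10, 0.05, 0.05]

avg_kbps = average_bit_rate(HYPOTHETICAL_RATES_KBPS, HYPOTHETICAL_USAGE)
```

Because the low rate states dominate during steady-state speech, the average sits much closer to the minimum rate than to the maximum, which matches the pattern the abstract describes.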

Articles published on Variable Rate Speech Coding

The Voice Activity Detection Algorithm Based on Spectral Entropy and High-Order Statistics

A Statistical Approach for Voiced Speech Detection

KLT-based adaptive entropy-constrained vector quantization for the speech signals

Limited error based event localizing temporal decomposition and its application to variable-rate speech coding

Pitch adaptive windows for improved excitation coding in low-rate CELP coders

A statistical model-based voice activity detection

Voicing-specific LPC quantization for variable-rate speech coding

Capacity enhancement of cellular CDMA by traffic-based control of speech bit rate

Finite state CELP for variable rate speech coding

Performance of packet reservation multiple access for digital mobile radio using variable rate speech coder

Real-time speech segmentation using pitch and convexity jump models: application to variable rate speech coding
