Method of encoding speech signals involving the extraction of speech formant candidates in real time

Perigaram K Rajasekaran, George R Doddington

doi:10.1121/1.404102

Abstract

Method of encoding speech signals which is based upon determining the roots of the linear prediction polynomial describing the spectrum of an analog speech signal, wherein the roots are candidates in determining the formants of the speech signal. The method involves the analysis of respective frames of sampled digital speech data using a linear predictive technique to determine a set of reflection coefficients or K-parameters, which are then converted into the equivalent predictor coefficients or A-parameters describing a prediction polynomial having a plurality of roots corresponding to the poles of an all-pole filter characterizing the vocal tract. A modified Bairstow technique is then employed for factoring out quadratic factors, which are then sorted in order of ascending bandwidth. In performing the modified Bairstow technique, initial estimates of the successive quadratic factors for a current frame of digital speech data are made in sequence, and the prediction polynomial is successively deflated to a reduced-order polynomial in determining the respective quadratic factors thereof. The initial estimate of the first quadratic factor is the smallest-bandwidth root determined from the previous frame of digital speech data. The quadratic factors, or roots, removed in this way are candidates for determining the formants of the speech signal.
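The chain of steps in the abstract (K-parameters to A-parameters, quadratic factoring with deflation, sorting by bandwidth) can be illustrated with a minimal Python sketch. It is not the patented implementation: it uses a textbook Newton-type Bairstow iteration rather than the modified technique, whose details the abstract does not give. The function names, the sign convention of the step-up recursion, the fallback guess `default_guess` for factors after the first, and the bandwidth formula B = -(fs/pi) ln|z| are all assumptions made for illustration.

```python
import cmath
import math

def step_up(k):
    """Reflection coefficients -> [1, a1, ..., ap] (one common sign convention, assumed)."""
    a = [1.0]
    for km in k:
        prev = a + [0.0]
        last = len(prev) - 1
        a = [prev[i] + km * prev[last - i] for i in range(len(prev))]
    return a

def divide_by_quadratic(p, u, v):
    """Divide p (descending coefficients, degree >= 1) by x^2 + u*x + v.
    Returns (quotient, r1, r0), where the remainder is r1*x + r0."""
    b = [float(c) for c in p]
    n = len(b) - 1
    for i in range(n - 1):
        b[i + 1] -= u * b[i]
        b[i + 2] -= v * b[i]
    return b[:n - 1], b[n - 1], b[n]

def bairstow_factor(p, u, v, tol=1e-13, max_iter=100):
    """Newton iteration on (u, v) driving the remainder of p / (x^2+u*x+v) to zero.
    Expects degree(p) >= 3. Returns (u, v, deflated polynomial)."""
    for _ in range(max_iter):
        q, r1, r0 = divide_by_quadratic(p, u, v)
        _, s1, s0 = divide_by_quadratic(q, u, v)
        # Jacobian of the remainder (r1, r0) with respect to (u, v)
        j11, j12 = u * s1 - s0, -s1
        j21, j22 = v * s1, -s0
        det = j11 * j22 - j12 * j21
        if det == 0.0:
            break
        du = (r1 * j22 - j12 * r0) / det
        dv = (j11 * r0 - j21 * r1) / det
        u, v = u - du, v - dv
        if abs(du) + abs(dv) < tol:
            break
    q, _, _ = divide_by_quadratic(p, u, v)
    return u, v, q

def formant_candidates(a, fs, first_guess, default_guess=(0.0, 0.5)):
    """Factor the prediction polynomial [1, a1, ..., ap] into quadratics and
    return (frequency_hz, bandwidth_hz) pairs sorted by ascending bandwidth.
    first_guess is the (u, v) of the previous frame's smallest-bandwidth factor;
    default_guess is a hypothetical starting point for the remaining factors."""
    poly = [c / a[0] for c in a]
    factors = []
    guess = first_guess
    while len(poly) - 1 >= 3:
        u, v, poly = bairstow_factor(poly, *guess)
        factors.append((u, v))
        guess = default_guess
    if len(poly) == 3:
        factors.append((poly[1] / poly[0], poly[2] / poly[0]))
    # A leftover linear factor (odd order) is a real root and is skipped here.
    out = []
    for u, v in factors:
        root = (-u + cmath.sqrt(u * u - 4.0 * v)) / 2.0
        freq = abs(cmath.phase(root)) * fs / (2.0 * math.pi)
        bw = -math.log(abs(root)) * fs / math.pi
        out.append((freq, bw))
    return sorted(out, key=lambda fb: fb[1])
```

In a frame-by-frame loop, the `(u, v)` pair of the first entry returned for one frame would be passed as `first_guess` for the next frame, mirroring the seeding described above. For a factor with two real roots the sketch reports only one of them, so such factors would need separate handling in practice.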
