Use of spectral autocorrelation in spectral envelope linear prediction for speech recognition

Hong Kook Kim, Hwang Soo Lee

doi:10.1109/89.784105

Abstract

This paper proposes a linear predictive (LP) analysis method where sample autocorrelations are estimated from the spectral envelope of a speech signal on the basis of the spectral autocorrelation. The spectral autocorrelation is defined as discrete quantities of speech spectrum with spectral resolution identical to the discrete Fourier transform (DFT) used to obtain the speech spectrum. From analytical and empirical derivation of its properties, we can estimate the fundamental frequency and the maximally correlated frequency for voiced and unvoiced speech, respectively, and then obtain the spectral envelope by sampling at a rate of the estimated frequency. A frequency normalization can be applied to the estimated spectral envelope because the number of samples of the spectral envelope usually differs from frame to frame. The spectral envelope is warped into the mel-frequency scale and the inverse DFT is applied to extract the estimate of sample autocorrelations. From the result of LP analysis on the sample autocorrelations, we finally obtain the spectral envelope cepstral coefficients (SECC). Hidden Markov model (HMM) recognition experiments show that SECC significantly improves the performance of a recognizer at low signal-to-noise ratios (SNRs) over several other representations.
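The last two steps of the abstract (inverse DFT of the spectral envelope to get sample autocorrelations, then LP analysis and conversion to cepstral coefficients) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the envelope is already available as a one-sided power spectrum on a uniform DFT grid, and it omits the envelope sampling, frequency normalization, and mel warping. The function names and the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p are assumptions made for this example.

```python
import numpy as np

def autocorr_from_power_spectrum(power, order):
    """Autocorrelation lags 0..order from a one-sided power spectrum.

    `power` holds bins 0..N/2 of an N-point DFT grid (assumed layout).
    The inverse DFT of a power spectrum is the sample autocorrelation.
    """
    r = np.fft.irfft(power)
    return r[:order + 1]

def levinson_durbin(r, order):
    """Solve the LP normal equations by the Levinson-Durbin recursion.

    Returns (a, err) with a[0] = 1 and A(z) = 1 + sum_k a[k] z^-k,
    where err is the final prediction error energy.
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Correlation of the current predictor with lag i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n_ceps of the all-pole model 1/A(z)."""
    p = len(a) - 1
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        c[n] = -(a[n] if n <= p else 0.0) - acc / n
    return c[1:]

# Synthetic check: the envelope of a first-order all-pole model
# 1 / |1 - 0.5 e^{-jw}|^2 sampled on a 512-point DFT grid.
w = 2.0 * np.pi * np.arange(257) / 512
envelope = 1.0 / np.abs(1.0 - 0.5 * np.exp(-1j * w)) ** 2

r = autocorr_from_power_spectrum(envelope, order=2)
a, err = levinson_durbin(r, order=2)
ceps = lpc_to_cepstrum(a, n_ceps=2)
```

For this synthetic envelope the recovered predictor should be close to [1, -0.5, 0], since the second-order fit to a first-order model leaves the extra coefficient near zero. In the method the abstract describes, the input to the inverse DFT would instead be the mel-warped, frequency-normalized envelope.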
