Abstract

This paper proposes a linear predictive (LP) analysis method in which sample autocorrelations are estimated from the spectral envelope of a speech signal, using the spectral autocorrelation. The spectral autocorrelation is defined on the discrete speech spectrum, at the same frequency resolution as the discrete Fourier transform (DFT) used to compute that spectrum. From properties of the spectral autocorrelation derived analytically and empirically, we estimate the fundamental frequency for voiced speech and the maximally correlated frequency for unvoiced speech, and then obtain the spectral envelope by sampling the spectrum at intervals of the estimated frequency. Because the number of spectral envelope samples usually differs from frame to frame, a frequency normalization can be applied to the estimated envelope. The spectral envelope is then warped onto the mel-frequency scale, and the inverse DFT is applied to obtain estimates of the sample autocorrelations. LP analysis of these sample autocorrelations finally yields the spectral envelope cepstral coefficients (SECC). Hidden Markov model (HMM) recognition experiments show that SECC significantly improves recognizer performance at low signal-to-noise ratios (SNRs) compared with several other representations.
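The processing chain described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the 60-400 Hz search range, the Hamming window, the 2595·log10 mel formula, the linear interpolation used for normalization and warping, the power floor, and the orders are all assumptions chosen for illustration. The spectral autocorrelation is taken here as the plain lag product of the DFT magnitude spectrum, and its strongest lag is used as the sampling interval for both voiced and unvoiced frames.

```python
import numpy as np

def spectral_autocorrelation(mag, max_lag):
    """Lag products of the magnitude spectrum, with lags measured in DFT bins."""
    n = len(mag)
    return np.array([np.dot(mag[:n - k], mag[k:]) for k in range(max_lag + 1)])

def estimate_spacing(mag, fs, n_fft, f_lo=60.0, f_hi=400.0):
    """Lag (in bins) with the strongest spectral autocorrelation in [f_lo, f_hi]."""
    bin_hz = fs / n_fft
    lo = max(1, int(np.ceil(f_lo / bin_hz)))
    hi = int(np.floor(f_hi / bin_hz))
    rho = spectral_autocorrelation(mag, hi)
    return lo + int(np.argmax(rho[lo:hi + 1]))

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def levinson(r, order):
    """Levinson-Durbin recursion for A(z) = 1 + sum_k a[k] z^-k."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum of 1/A(z) via the standard LP-to-cepstrum recursion."""
    p = len(a) - 1
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = 0.0
        for k in range(max(1, n - p), n):
            acc += k * c[k] * a[n - k]
        c[n] = -acc / n - (a[n] if n <= p else 0.0)
    return c[1:]

def secc_sketch(frame, fs, n_fft=512, order=10, n_ceps=12, n_env=64):
    """Envelope-based cepstral coefficients for one frame (illustrative only)."""
    mag = np.abs(np.fft.rfft(frame * np.hamming(len(frame)), n_fft))
    lag = estimate_spacing(mag, fs, n_fft)
    # Sample the power spectrum at multiples of the estimated spacing.
    idx = np.arange(lag, len(mag), lag)
    env = mag[idx] ** 2
    env = np.maximum(env, 1e-6 * env.max())  # assumed floor for numerical safety
    # Frequency normalization and mel warping in one resampling step:
    # interpolate onto a fixed number of mel-spaced frequencies.
    grid_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2.0), n_env + 1))
    env_fixed = np.interp(grid_hz / (fs / n_fft), idx, env)
    # Inverse DFT of the power envelope gives autocorrelation estimates.
    r = np.fft.irfft(env_fixed)[:order + 1]
    a, _ = levinson(r, order)
    return lpc_to_cepstrum(a, n_ceps), lag

# Usage: a synthetic frame with harmonics of 250 Hz sampled at 8 kHz.
fs, n = 8000, 512
t = np.arange(n) / fs
frame = sum(np.cos(2 * np.pi * 250 * k * t) / k for k in range(1, 11))
ceps, lag = secc_sketch(frame, fs)
f0_estimate = lag * fs / 512  # spacing in bins converted to Hz
```

Because the lag is restricted to whole DFT bins, the frequency estimate in this sketch is only as fine as the DFT resolution, which matches the abstract's statement that the spectral autocorrelation shares the resolution of the DFT.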
