Robust spectro-temporal features based on autoregressive models of Hilbert envelopes

Sriram Ganapathy,Samuel Thomas,Hynek Hermansky

doi:10.1109/icassp.2010.5495668

Abstract

In this paper, we present a robust spectro-temporal feature extraction technique using autoregressive models (AR) of sub-band Hilbert envelopes. AR models of Hilbert envelopes are derived using frequency domain linear prediction (FDLP). From the sub-band Hilbert envelopes, spectral features are derived by integrating these envelopes in short-term frames and the temporal features are formed by converting these envelopes into modulation frequency components. The spectral and temporal feature streams are then combined at the phoneme posterior level and are used as the input features for a recognition system. For the proposed features, robustness is achieved by using novel techniques of noise compensation and gain normalization. Phoneme recognition experiments on telephone speech in the HTIMIT database show significant performance improvements for the proposed features when compared to other robust feature techniques (average relative reduction of 10.6 % in phoneme error rate). In addition to the overall phoneme recognition rates, the performance with broad phonetic classes is also reported.

Full Text