An alternative representation of speech using the modified group delay feature

R. M. Hegde, H. A. Murthy

doi:10.1109/spcom.2004.1458402

Abstract

A representation of speech is complete only when both the Fourier transform magnitude and phase spectra are used to extract features. Conventionally, however, speech is represented by features derived from the Fourier transform magnitude spectrum alone. In this paper, we propose an alternative representation of speech that uses the modified group delay function, which is derived from the Fourier transform phase spectrum. Cepstral features are derived from the modified group delay function and are called the modified group delay feature (MODGDF). The robustness of the MODGDF to convolutional and additive noise is analyzed mathematically. Class separability and task independence of the MODGDF are then illustrated via the sequential forward search feature selection method. The results of performance evaluation of the MODGDF on four speech processing tasks are presented: phoneme recognition, syllable recognition, speaker identification, and language identification. Motivated by the results of feature evaluation and performance evaluation, the MODGDF is proposed as an alternative representation of speech.
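The abstract does not state the formula, so the following is only a minimal sketch of how cepstral features might be obtained from a modified group delay function for a single frame. It assumes the commonly cited form, in which the group delay numerator X_R Y_R + X_I Y_I (with Y the transform of n·x[n]) is divided by a cepstrally smoothed magnitude spectrum raised to the power 2γ, compressed with an exponent α, and then passed through a DCT. The function name `modgdf` and the values of `n_fft`, `alpha`, `gamma`, `lifter`, and `n_ceps` are illustrative assumptions, not settings taken from the paper.

```python
import numpy as np

def modgdf(frame, n_fft=512, alpha=0.4, gamma=0.9, lifter=8, n_ceps=13):
    """Sketch: modified group delay cepstral features for one frame.

    All parameter defaults are hypothetical, chosen only for illustration.
    """
    x = np.asarray(frame, dtype=float)
    n = np.arange(len(x))
    X = np.fft.rfft(x, n_fft)       # spectrum of x[n]
    Y = np.fft.rfft(n * x, n_fft)   # spectrum of n * x[n]

    # Cepstrally smoothed magnitude spectrum S: keep only the low-quefrency
    # part of the real cepstrum (and its symmetric mirror), zero the rest.
    log_mag = np.log(np.abs(X) + 1e-10)
    ceps = np.fft.irfft(log_mag, n_fft)
    ceps[lifter:n_fft - lifter + 1] = 0.0
    S = np.exp(np.fft.rfft(ceps).real)

    # Group delay numerator over the smoothed spectrum, then compress the
    # dynamic range while keeping the sign.
    tau = (X.real * Y.real + X.imag * Y.imag) / (S ** (2.0 * gamma))
    tau_m = np.sign(tau) * np.abs(tau) ** alpha

    # Unnormalized DCT-II of the modified group delay spectrum.
    K = len(tau_m)
    k = np.arange(K)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * k + 1) / (2 * K))
    return basis @ tau_m
```

As a sanity check on the sketch, a unit impulse delayed by three samples has a flat group delay of 3 at every frequency, so only the zeroth coefficient should be non-zero.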

Full Text