Abstract

A representation of speech is complete when features are extracted from both the Fourier transform magnitude and phase spectra. Conventionally, however, speech is represented by features derived from the Fourier transform magnitude spectra. In this paper, we propose an alternative representation of speech that uses the modified group delay function derived from the Fourier transform phase spectra. Cepstral features are derived from the modified group delay function; these features are called the modified group delay feature (MODGDF). The robustness of the MODGDF to convolutional and additive noise is then analyzed mathematically. The class separability and task independence of the MODGDF are illustrated via the sequential forward search feature selection method. The results of performance evaluation of the MODGDF are presented for four speech processing tasks: phoneme recognition, syllable recognition, speaker identification, and language identification. Motivated by the results of feature evaluation and performance evaluation, the MODGDF is proposed as an alternative representation of speech.
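
To make the feature pipeline concrete, the following is a minimal sketch of how cepstral features might be computed from a modified group delay function for a single speech frame. It is not taken from the abstract: it assumes the commonly cited formulation in which the group delay is computed from the spectra of x(n) and n·x(n), the denominator is replaced by a cepstrally smoothed spectrum raised to the power 2γ, the result is compressed with an exponent α, and a DCT yields the cepstra. The function names and the default values of `alpha`, `gamma`, `lifter`, and `n_ceps` are illustrative assumptions.

```python
import numpy as np

def modified_group_delay(frame, n_fft=512, lifter=8, alpha=0.4, gamma=0.9):
    """Modified group delay function of one frame (illustrative sketch).

    alpha, gamma and lifter are assumed tuning parameters, not values
    stated in the abstract.
    """
    n = np.arange(len(frame))
    X = np.fft.rfft(frame, n_fft)        # spectrum of x(n)
    Y = np.fft.rfft(n * frame, n_fft)    # spectrum of n * x(n)

    # Cepstrally smoothed magnitude spectrum S(w): keep only the
    # low-quefrency part of the (symmetric) real cepstrum.
    log_mag = np.log(np.abs(X) + 1e-10)
    ceps = np.fft.irfft(log_mag, n_fft)
    ceps[lifter:n_fft - lifter + 1] = 0.0
    S = np.exp(np.fft.rfft(ceps, n_fft).real)

    # Group delay numerator over the smoothed spectrum, then compression.
    tau = (X.real * Y.real + X.imag * Y.imag) / (S ** (2.0 * gamma))
    return np.sign(tau) * np.abs(tau) ** alpha

def modgdf_features(frame, n_ceps=13, **kwargs):
    """Cepstral features: DCT-II of the modified group delay function."""
    tau_m = modified_group_delay(frame, **kwargs)
    K = len(tau_m)
    # Explicit DCT-II basis, so only NumPy is required.
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), np.arange(K) + 0.5) / K)
    return basis @ tau_m
```

As a sanity check on the sketch, a unit impulse delayed by d samples has a flat magnitude spectrum and linear phase, so with `alpha=1.0` and `gamma=1.0` the function returns a constant group delay of d at every frequency bin.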

