Abstract

Epoch extraction from speech involves the suppression of vocal tract resonances, either by linear-prediction-based inverse filtering or by filtering at very low frequency. Degradations due to channel effects and significant attenuation of low-frequency components (< 300 Hz) make epoch extraction from telephone-quality speech challenging. An epoch extraction method is proposed that treats the vertical striations present in the time-frequency representation of voiced speech as candidates for the epochs. A time-frequency representation with better-localized vertical striations is estimated using a filter bank of single-pole filters. The time marginal of this time-frequency representation is computed to locate the epochs. The proposed algorithm is evaluated on a database of five speakers that provides simultaneous speech and electroglottographic recordings. Telephone-quality speech is simulated using the G.191 software tools. The identification rate of the state-of-the-art methods degrades substantially for telephone-quality speech, whereas that of the proposed method remains the same, comparable to that for clean speech.
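
The pipeline described above (single-pole filter bank, time marginal, epoch location) can be illustrated with a minimal sketch. This is not the authors' implementation: the pole radius `r`, the 16 center frequencies, the relative threshold, the peak-picking rule, and the function names `single_pole_envelopes`, `time_marginal`, and `pick_epochs` are all assumptions made for illustration, and a toy impulse train stands in for voiced speech.

```python
import cmath
import math


def single_pole_envelopes(signal, fs, center_freqs, r=0.9):
    """Envelope of a one-pole complex resonator at each center frequency.

    Assumption: each channel is y[n] = x[n] + p * y[n-1] with
    p = r * exp(j*2*pi*fc/fs); the abstract does not give the filter form.
    """
    envelopes = []
    for fc in center_freqs:
        pole = r * cmath.exp(2j * math.pi * fc / fs)
        y = 0j
        env = []
        for x in signal:
            y = x + pole * y      # single-pole recursion
            env.append(abs(y))    # envelope at this time instant
        envelopes.append(env)
    return envelopes


def time_marginal(envelopes):
    """Sum the time-frequency envelopes over frequency at each time instant."""
    return [sum(column) for column in zip(*envelopes)]


def pick_epochs(marginal, rel_threshold=0.5):
    """Assumed peak picker: local maxima above a fraction of the global maximum."""
    floor = rel_threshold * max(marginal)
    return [
        n
        for n in range(1, len(marginal) - 1)
        if marginal[n] >= floor
        and marginal[n] > marginal[n - 1]
        and marginal[n] >= marginal[n + 1]
    ]


# Toy input: impulse train at 100 Hz (period 80 samples at fs = 8 kHz).
fs = 8000
signal = [1.0 if n % 80 == 40 else 0.0 for n in range(400)]

# Hypothetical filter bank spanning roughly the telephone band.
center_freqs = [300 + 200 * k for k in range(16)]

envelopes = single_pole_envelopes(signal, fs, center_freqs)
marginal = time_marginal(envelopes)
epochs = pick_epochs(marginal)
print(epochs)
```

On this toy input the detected epochs coincide with the impulse locations (samples 40, 120, 200, 280, 360), because every channel's envelope jumps at the same instant and the jumps add up in the time marginal. Real speech would need a more careful choice of pole radius and peak-picking rule than the ones assumed here.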
