Abstract

Robust syllabification of continuous speech is a vital aspect of language and speech processing systems. Syllabification can be performed by detecting syllable nuclei. The syllable is the basic production unit of human speech, and syllable nuclei can be attributed to high-energy sonorants, or resonant sounds, which are relatively loud and carry a clear pitch. In this work, the high spectral energy at formants during the glottal closure phase is explored to improve syllable nuclei detection in continuous speech. The spectral energy at formants is extracted using the group delay method, and glottal closure instants are located using the zero-frequency-filter-based method. The performance of the proposed method is analysed in clean and noisy environments using the TIMIT and NTIMIT databases, respectively. The proposed method shows a significant improvement in syllable nuclei detection over existing state-of-the-art methods.
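
As an illustration of the glottal closure instant (GCI) step only, the following is a minimal sketch of zero-frequency filtering as it is commonly described in the literature. It is not the authors' implementation. The function names, the default pitch period of 10 ms, the window of about 1.5 pitch periods, and the three trend-removal passes are all assumptions chosen for the example. The sketch differences the signal, passes it through a cascade of ideal zero-frequency resonators (implemented here as repeated cumulative sums), removes the resulting polynomial trend by subtracting a local mean, and takes positive-going zero crossings as GCI candidates.

```python
import numpy as np


def remove_local_mean(x, half_window):
    """Subtract a centred moving average of length 2*half_window + 1."""
    width = 2 * half_window + 1
    kernel = np.ones(width) / width
    return x - np.convolve(x, kernel, mode="same")


def zero_frequency_filter(speech, fs, pitch_period_s=0.01, passes=3):
    """Return a zero-frequency-filtered signal (illustrative sketch).

    pitch_period_s is an ASSUMED average pitch period; in practice it
    would be estimated from the speech itself.
    """
    # First difference removes any slowly varying bias in the recording.
    x = np.diff(np.asarray(speech, dtype=np.float64), prepend=0.0)

    # Two cascaded ideal resonators at 0 Hz have the transfer function
    # 1 / (1 - z^-1)^4, which equals four successive cumulative sums.
    y = x
    for _ in range(4):
        y = np.cumsum(y)

    # The output grows polynomially; remove the trend with a local mean
    # over roughly 1.5 pitch periods, repeated a few times.
    half_window = int(round(0.75 * pitch_period_s * fs))
    for _ in range(passes):
        y = remove_local_mean(y, half_window)
    return y


def gci_candidates(zff_signal):
    """Indices of positive-going zero crossings of the filtered signal."""
    z = np.asarray(zff_signal)
    return np.flatnonzero((z[:-1] < 0) & (z[1:] >= 0)) + 1
```

Samples within about `passes * half_window` of either end of the signal are affected by the zero-padded moving average and should be discarded. Which zero-crossing direction aligns with glottal closure depends on the polarity of the recording, so the positive-going choice here is a convention rather than a guarantee.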
