Abstract

The current growth of information and communication technologies has drawn the attention of many researchers to speech technologies. Speech technology comprises many subfields, such as speech synthesis, speech recognition, speaker recognition, speech compression, speaker verification and multimodal interaction. The basic units of speech synthesis and speech recognition systems are the syllable, the phoneme and the word. This study focuses on syllable segmentation, or syllabification, with the aim of further developing a speech synthesis tool for the Tamil language for Human Computer Interaction (HCI). Syllable boundaries are identified using the first formant frequency, F1. The proposed syllable segmentation algorithm is applied to and tested on a recorded continuous speech corpus. First, the continuous speech signal is divided into segments by removing the silence regions. The silence removal method used in this work relies on two features, signal energy and spectral centroid. After the silence portions are removed, the speech segments are processed using Linear Predictive Coding (LPC) to extract the formant frequencies. Peaks in the formant frequencies are then used as cues to mark the syllable boundaries in the speech. The proposed algorithm achieves an average accuracy of 89% in identifying syllable boundaries when compared with hand-labeled syllable boundaries.
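The silence removal step can be illustrated with a minimal sketch. The abstract names only the two features (signal energy and spectral centroid); everything else here is an assumption made for illustration: the function names, the frame length, the fixed thresholds, and the rule that a frame counts as speech when both features exceed their thresholds. The LPC and formant steps are not covered.

```python
import cmath
import math


def short_term_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of one frame, via a naive DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    total = sum(mags)
    if total < 1e-12:
        return 0.0  # an all-zero frame has no meaningful centroid
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def speech_segments(signal, sample_rate, frame_len=64,
                    energy_thr=0.01, centroid_thr=100.0):
    """Return (start, end) sample ranges of non-silent regions.

    Assumption: a frame is speech when BOTH features exceed their
    thresholds. The thresholds are hypothetical fixed values, not
    taken from the paper.
    """
    segments = []
    start = None
    n_frames = len(signal) // frame_len
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        is_speech = (short_term_energy(frame) > energy_thr
                     and spectral_centroid(frame, sample_rate) > centroid_thr)
        if is_speech and start is None:
            start = i * frame_len            # a speech segment begins
        elif not is_speech and start is not None:
            segments.append((start, i * frame_len))  # segment ends
            start = None
    if start is not None:
        segments.append((start, n_frames * frame_len))
    return segments


if __name__ == "__main__":
    # Synthetic test signal: silence, a 1 kHz tone, silence (8 kHz sampling).
    tone = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(256)]
    signal = [0.0] * 256 + tone + [0.0] * 256
    print(speech_segments(signal, 8000))  # [(256, 512)]
```

Each returned range would then be passed on to the LPC stage. In practice the thresholds would likely be derived from the recording itself rather than fixed in advance, and an FFT routine would replace the naive DFT for realistic frame sizes.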
