Abstract

This paper has two aims: (a) to develop a robust, knowledge-based syllable segmentation algorithm and (b) to establish the importance of accurate segmentation in both the training and testing phases of a speech recognition system. An algorithm for segmenting the speech signal into syllables is first developed. It uses a non-statistical technique based on group delay (GD) segmentation and vowel onset point (VOP) detection. The transcription corresponding to each utterance is syllabified using rules, which produces an annotation for the training data. The annotated training data is then used to train a syllable-based speech recognition system. The test signal is also segmented using the proposed algorithm, and this segmentation information is incorporated into the linguistic search space to reduce both computational complexity and word error rate (WER). WERs of 4.4% and 21.2% are reported on the TIMIT and NTIMIT databases, respectively.
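
For readers unfamiliar with the metric, the sketch below shows how WER is conventionally computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. This is the standard definition, not something the abstract itself specifies; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration, and the scoring tool actually used for the reported figures is not stated here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenizes on whitespace,
    which is an assumption and not taken from the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Because insertions are counted, this ratio can exceed 1.0 when the hypothesis is much longer than the reference; multiplying by 100 gives the percentage form used in the abstract.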
