Abstract

Automatic syllabification, the decomposition of a word into syllables, is an important component of automatic speech recognition (ASR) systems that use both syllable-based acoustic and language models. It can be performed on either phoneme or grapheme sequences. Phonemic syllabification is the more complex of the two since it requires grapheme-to-phoneme conversion (G2P) as a preceding step. It generally gives high accuracy for formal words, but its accuracy may decrease for person-names. In contrast, graphemic syllabification is simpler and has more potential for person-names. This research focuses on developing a graphemic syllabification model that combines phonotactic rules with Fuzzy k-Nearest Neighbour in every Class (FkNNC). The phonotactic rules are designed to find the deterministic syllabification points, while FkNNC, as a statistical classifier, is expected to find the remaining stochastic syllabification points. A recovery procedure is proposed to correct wrong syllabification points produced by FkNNC. Fivefold cross-validation on a dataset of 50k formal words, selected from the great dictionary of the Indonesian language, shows that the proposed model gives a syllable error rate (SER) of 2.48% and that the proposed recovery procedure reduces the SER to 2.27%, which is still higher than that of phonemic syllabification (only 0.99%). However, the model is also capable of handling a dataset of 15k high-variance person-names, with an SER of 7.45% that the recovery procedure reduces to 6.78%.
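
To put the reported figures in perspective, the relative SER reduction attributable to the recovery procedure can be worked out directly from the numbers above. The following is a minimal sketch of that arithmetic only; the helper name `relative_reduction` is an illustrative assumption and does not come from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the relative reduction, in percent, going from `before` to `after`.

    Hypothetical helper for illustration; not part of the described model.
    """
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# SER values (in percent) as reported in the abstract.
formal_words = relative_reduction(2.48, 2.27)   # 50k formal words
person_names = relative_reduction(7.45, 6.78)   # 15k person-names

print(f"formal words: {formal_words:.2f}% relative SER reduction")
print(f"person-names: {person_names:.2f}% relative SER reduction")
```

On these figures, the recovery procedure removes roughly 8.5% of the remaining errors on formal words and roughly 9.0% on person-names, so its relative effect is similar on both datasets.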
