Indonesian graphemic syllabification using a nearest neighbour classifier and recovery procedure

Edwina Anky Parande, Suyanto Suyanto

doi:10.1007/s10772-018-09569-3

Abstract

Automatic syllabification, which decomposes a word into syllables, is an important component of automatic speech recognition (ASR) systems that use both syllable-based acoustic and language models. It can be performed on either phoneme or grapheme sequences. Phonemic syllabification is the more complex of the two because it requires grapheme-to-phoneme conversion (G2P) as a preceding step. It generally gives high accuracy for many formal words, but its accuracy may decrease for person names. In contrast, graphemic syllabification is simpler and has more potential for person names. This research develops a graphemic syllabification model that combines phonotactic rules with Fuzzy k-Nearest Neighbour in every Class (FkNNC). The phonotactic rules are designed to find some deterministic syllabification points, while FkNNC, a statistical classifier, is expected to find the remaining stochastic syllabification points. A recovery procedure is proposed to correct the wrong syllabification points produced by FkNNC. Fivefold cross-validation on a dataset of 50k formal words, selected from the great dictionary of the Indonesian language, shows that the proposed model gives a syllable error rate (SER) of 2.48%, which the proposed recovery procedure reduces to 2.27%. This is higher than the SER produced by phonemic syllabification (only 0.99%). However, the model is also capable of handling a dataset of 15k high-variance person names, with an SER of 7.45% that the recovery procedure reduces to 6.78%.
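The abstract reports results as a syllable error rate but does not define it. As a minimal sketch, assume SER is the fraction of reference syllables whose character span does not appear in the predicted syllabification of the same word. The function names and the toy words below are hypothetical illustrations, not taken from the paper or its dataset.

```python
from typing import List, Set, Tuple


def syllable_spans(syllables: List[str]) -> Set[Tuple[int, int]]:
    """Return the (start, end) character span of each syllable in a word."""
    spans = set()
    start = 0
    for syl in syllables:
        end = start + len(syl)
        spans.add((start, end))
        start = end
    return spans


def syllable_error_rate(
    references: List[List[str]], predictions: List[List[str]]
) -> float:
    """Assumed SER: share of reference syllables missing from the prediction."""
    total = 0
    wrong = 0
    for ref, pred in zip(references, predictions):
        if "".join(ref) != "".join(pred):
            raise ValueError("reference and prediction must spell the same word")
        ref_spans = syllable_spans(ref)
        total += len(ref_spans)
        # A reference syllable counts as wrong if its exact span is not predicted.
        wrong += len(ref_spans - syllable_spans(pred))
    return wrong / total if total else 0.0


# Toy data for illustration only (not from the paper's 50k-word dataset).
refs = [["ma", "kan"], ["ber", "la", "ri"], ["se", "ko", "lah"]]
preds = [["ma", "kan"], ["be", "rla", "ri"], ["se", "ko", "lah"]]
ser = syllable_error_rate(refs, preds)
print(f"SER = {ser:.2%}")
```

Under this assumed definition, one misplaced boundary in "berlari" spoils two of its three reference syllables, so a single wrong syllabification point can cost more than one syllable error.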

Journal: International Journal of Speech Technology
Publication Date: Nov 8, 2018
Citations: 22

Similar Papers

A Myanmar large vocabulary continuous speech recognition system
Hay Mar Soe Naing ... Xinhui Hu
01 Dec 2015

Improving Myanmar Automatic Speech Recognition with Optimization of Convolutional Neural Network Parameters
Zenodo (CERN European Organization for Nuclear Research)
01 Dec 2018

Improving Myanmar Automatic Speech Recognition with Optimization of Convolutional Neural Network Parameters
Aye Nyein Mon ... Ye Kyaw Thu
International Journal on Natural Language Computing | VOL. 7
31 Dec 2019

Interval valued fuzzy sets k-nearest neighbors classifier for finger vein recognition
Nordiana Mukahar ... Bakhtiar Affendi Rosdi
Journal of Physics: Conference Series | VOL. 890
01 Sep 2017
