Abstract

For large vocabulary continuous speech recognition (LVCSR), selecting an appropriate lexical unit is the first important step. When the word is chosen as the lexical unit, the word boundary detection problem is avoided. However, the choice of lexical unit is not clear-cut for languages with derivational morphological structure (e.g., agglutinative languages), and many languages (Chinese, Japanese, etc.) have no word boundaries. Based on a Uyghur LVCSR system, this paper analyzes automatic speech recognition (ASR) systems built on multi-layered lexicons, compares the ASR results of the various linguistic layers, and proposes a new method that balances the advantages of two lexicon layers. By aligning and comparing the ASR results of the two layers, we analyze error patterns and extract samples as training data for the alternative selection method. Experimental results show that the proposed method effectively improves ASR accuracy while maintaining a small lexicon size.
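
The alignment-and-comparison step can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes both layers' hypotheses have already been converted to comparable token sequences (for example, sub-word output rejoined into words), and it swaps in Python's standard `difflib.SequenceMatcher` as a generic aligner. The function name `disagreements` and the placeholder tokens are hypothetical.

```python
from difflib import SequenceMatcher


def disagreements(layer_a, layer_b):
    """Align two token sequences and return the spans where they differ.

    layer_a, layer_b: lists of tokens from two recognizers (assumed to be
    in a comparable tokenization; this is an assumption of the sketch).
    Returns a list of (tag, tokens_a, tokens_b) tuples, where tag is one of
    'replace', 'delete', or 'insert' as reported by difflib.
    """
    # autojunk=False disables difflib's heuristic that ignores frequent
    # tokens in long sequences, so every token takes part in the alignment.
    matcher = SequenceMatcher(a=layer_a, b=layer_b, autojunk=False)
    spans = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            spans.append((tag, layer_a[i1:i2], layer_b[j1:j2]))
    return spans


# Placeholder tokens only; these are not data from the paper.
hyp_a = ["w1", "w2", "w3", "w4"]
hyp_b = ["w1", "x2", "w3", "w4"]
print(disagreements(hyp_a, hyp_b))
```

Each returned span marks a position where the two layers disagree. Under the assumptions above, such spans are the kind of sample that could be labeled against a reference transcript and used to train a selector between the two outputs.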
