Robust/fast out-of-vocabulary spoken term detection by N-gram index with exact distance through text/speech input

Nagisa Sakamoto, Seiichi Nakagawa

doi:10.1109/apsipa.2013.6694366

Abstract

In spoken term detection, handling out-of-vocabulary (OOV) words is essential, and sub-word-unit-based recognition and retrieval methods have therefore been proposed. This paper describes a very fast Japanese spoken term detection system that is robust to OOV words. We used individual syllables as the sub-word unit in continuous speech recognition and built an n-gram index of syllables over the recognized syllable lattice. This n-gram indexing and retrieval method addresses OOV terms and enables high-speed retrieval. Specifically, in this paper we redefined the n-gram distance and used trigrams, bigrams, and unigrams, instead of trigrams alone, to calculate the exact distance. In experiments with both text and speech queries, retrieval performance improved.
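The indexing idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a single recognized syllable sequence instead of a full lattice, and it performs exact matching only, because the abstract does not define the redefined n-gram distance. The function names `build_ngram_index`, `split_query`, and `find_exact` are hypothetical, as is the choice to cover a query with non-overlapping trigrams followed by a bigram or unigram for the remainder.

```python
from collections import defaultdict


def build_ngram_index(syllables, max_n=3):
    """Map every syllable n-gram (n = 1..max_n) to the set of its start positions."""
    index = defaultdict(set)
    for n in range(1, max_n + 1):
        for start in range(len(syllables) - n + 1):
            index[tuple(syllables[start:start + n])].add(start)
    return index


def split_query(query, max_n=3):
    """Cover the query with non-overlapping chunks (assumed scheme):
    trigrams first, then one bigram or unigram for any remainder."""
    return [tuple(query[i:i + max_n]) for i in range(0, len(query), max_n)]


def find_exact(index, query, max_n=3):
    """Return sorted start positions where all query chunks occur back to back."""
    chunks = split_query(query, max_n)
    if not chunks:
        return []
    hits = []
    for start in index.get(chunks[0], ()):
        pos = start
        for chunk in chunks:
            # Each chunk must begin exactly where the previous one ended.
            if pos not in index.get(chunk, ()):
                break
            pos += len(chunk)
        else:
            hits.append(start)
    return sorted(hits)


# Toy romanized syllable sequence (illustrative data, not from the paper).
recognized = ["to", "yo", "ha", "shi", "da", "i", "ga", "ku"]
index = build_ngram_index(recognized)
print(find_exact(index, ["ha", "shi", "da", "i"]))  # one trigram plus one unigram
```

Because each chunk is looked up directly in the index, a four-syllable query needs only one trigram lookup and one unigram lookup. A real system would additionally have to traverse lattice arcs and score approximate matches against recognition errors, both of which this sketch omits.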

Full Text