Open vocabulary spoken content retrieval by front-ending with spoken term detection

Tomoko Takigami, Tomoyosi Akiba

doi:10.1109/apsipa.2013.6694130

Abstract

Dealing with speech recognition errors and out-of-vocabulary (OOV) words is a common challenge in spoken document processing. In this work, we propose a spoken content retrieval (SCR) method that incorporates spoken term detection (STD) as a pre-processing stage. The proposed method first performs STD for each term appearing in the given query topic, and then uses the detection results to calculate the relevance of each retrieved document to the topic. Front-ending with STD makes it possible to use even misrecognized and OOV words as clues in the back-end document retrieval process. We also propose a novel retrieval model designed specifically for the proposed SCR method. It incorporates term co-occurrences into the conventional vector space model in order to emphasize reliable clues in the similarity calculation, which enables the retrieval process to work robustly on documents that contain errors. The experimental results showed that the proposed SCR method improved retrieval performance when a query topic included OOV words, even though it relied on lower-accuracy syllable-based ASR results. They also showed that the proposed retrieval model significantly improved retrieval accuracy not only for the proposed SCR method but also for conventional SCR methods.
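The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. It assumes the STD front end returns `(term, doc_id, score)` tuples, that detections below a confidence threshold are discarded, and that the surviving scores are summed into a per-document term vector compared with the query by plain cosine similarity. The tuple format, the `threshold` parameter, and the function name `rank_documents` are hypothetical and not taken from the paper. The co-occurrence weighting of the proposed retrieval model is not included, because the abstract does not give its formula.

```python
import math
from collections import defaultdict


def rank_documents(query_terms, detections, threshold=0.5):
    """Rank documents by cosine similarity over STD-derived term weights.

    detections: iterable of (term, doc_id, score) tuples. This is an
    assumed output format for the STD front end, not the paper's.
    """
    # Query vector: raw frequency of each term in the query topic.
    query_vec = defaultdict(float)
    for term in query_terms:
        query_vec[term] += 1.0

    # Document vectors: sum the scores of detections that pass the threshold.
    # Only query terms are detected, so vectors are restricted to those terms.
    doc_vecs = defaultdict(lambda: defaultdict(float))
    for term, doc_id, score in detections:
        if term in query_vec and score >= threshold:
            doc_vecs[doc_id][term] += score

    q_norm = math.sqrt(sum(w * w for w in query_vec.values()))
    ranked = []
    for doc_id, vec in doc_vecs.items():
        d_norm = math.sqrt(sum(w * w for w in vec.values()))
        if d_norm == 0.0:
            continue  # no usable evidence for this document
        dot = sum(query_vec[t] * w for t, w in vec.items())
        ranked.append((doc_id, dot / (q_norm * d_norm)))

    # Highest similarity first.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Toy detections for a two-term query; the last one falls below the threshold.
detections = [
    ("kyoto", "d1", 0.9),
    ("tower", "d1", 0.8),
    ("kyoto", "d2", 0.7),
    ("tower", "d3", 0.3),
]
ranked = rank_documents(["kyoto", "tower"], detections)
```

Because the document vectors are built from detections rather than from a word-level transcript, a query term can contribute to the ranking even when it is missing from the recognizer's vocabulary, which is the property the abstract attributes to the STD front end.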

Full Text