Dual-Space Re-ranking Model for Efficient Document Retrieval, User Modeling and Adaptation

Jan Stas, Jozef Juhar, Martin Lojka, Daniel Hladek

doi:10.23919/elmar.2018.8534677

Abstract

The growing demand for more accurate and robust automatic transcription of spontaneous Slovak speech calls for advanced methods of adapting acoustic and language models to user-specific voice characteristics and to the topic of the speech. One way to increase the domain robustness of language models is to improve the retrieval of text documents relevant to the current topic of the speech and to use them to adapt the existing background language model. This paper covers the analysis, design and implementation of a new dual-space re-ranking model for document retrieval, for adaptation of language models to the current topic of speech, and for personalization of the speech recognition system. In experiments, the proposed dual-space re-ranking model, which averages the coefficients produced by latent semantic indexing and paragraph vector ranking models, yields an additional 1% relative improvement in word error rate over ranking with a single-space model.
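To make the averaging idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each ranking model (latent semantic indexing and paragraph vectors) has already produced one similarity coefficient per candidate document, and that the two are combined by a plain arithmetic mean. The function name `dual_space_rerank`, the document identifiers, and the score values are hypothetical and chosen only for illustration; the abstract does not specify any normalization or weighting.

```python
from typing import Dict, List, Tuple


def dual_space_rerank(
    lsi_scores: Dict[str, float],
    pv_scores: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Average two per-document similarity scores and sort descending.

    Assumption: both inputs map a document id to a comparable similarity
    coefficient (for example, a cosine similarity to the query).
    """
    # Only documents scored in both spaces can be averaged.
    shared = lsi_scores.keys() & pv_scores.keys()
    combined = {doc: (lsi_scores[doc] + pv_scores[doc]) / 2.0 for doc in shared}
    # Highest averaged score first; ties are broken by document id.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores for three candidate documents.
lsi = {"doc_a": 0.80, "doc_b": 0.60, "doc_c": 0.30}
pv = {"doc_a": 0.40, "doc_b": 0.90, "doc_c": 0.35}

ranking = dual_space_rerank(lsi, pv)
for doc_id, score in ranking:
    print(f"{doc_id}\t{score:.3f}")
```

In this toy input, the document ranked first by latent semantic indexing alone is not the one ranked first after averaging, which is the kind of reordering a dual-space model can introduce. The top-ranked documents would then serve as adaptation text for the background language model.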

Full Text