A Neural Document Language Modeling Framework for Spoken Document Retrieval

Li-Phen Yen, Zhen-Yu Wu, Kuan-Yu Chen

doi:10.1109/icassp40776.2020.9054066

Abstract

Recent developments in deep learning have led to significant innovation in many classic and practical areas, including speech recognition, computer vision, question answering, and information retrieval. In natural language processing (NLP), language representations learned through autoregressive language modeling or autoencoding have achieved great success in many downstream tasks, so this line of work has recently become a major stream of research. Because large volumes of multimedia data containing speech have become part of daily life around the world, spoken document retrieval (SDR), which aims at retrieving relevant multimedia content to satisfy users’ queries, has been an important research subject over the past decades. To enhance SDR performance, this paper proposes a neural retrieval framework that combines the merits of the language modeling (LM) mechanism in SDR with the abstractive information learned by language representation models. To our knowledge, this is a pioneering study on supervised training of a neural LM-based SDR framework, especially in combination with pretrained language representation methods. A series of empirical SDR experiments conducted on a benchmark collection demonstrates the efficacy of the proposed framework compared with several strong existing baseline systems.

Full Text