Multilingual spoken term detection: a review

G Deekshitha, Leena Mary

doi:10.1007/s10772-020-09732-9

Abstract

In modern multilingual societies, there is a demand for multilingual Automatic Speech Recognition (ASR) and Spoken Term Detection (STD). Multilingual spoken term detection refers to the process of retrieving relevant audio files from a large multilingual database using audio queries. This paper presents an overview of various efforts on multilingual spoken term detection, including work on low-resource languages. Different methodologies are discussed in detail and compared. Approaches to multilingual STD are organized by feature representation, tokenization technique, matching technique and availability of resources. The languages and corresponding datasets employed for multilingual STD are listed for quick reference, and the benchmarking platforms for multilingual STD are also discussed. The paper aims to provide a quick overview of the techniques and datasets widely used in multilingual STD research.

Full Text