A dynamic term discovery strategy for automatic speech recognizers with evolving dictionaries

Alejandro Coucheiro-Limeres, Javier Ferreiros-López, Fernando Fernández-Martínez, Ricardo Córdoba

doi:10.1016/j.eswa.2021.114860

Open Access

Abstract

We present a dynamic term discovery (TD) strategy that automatically adapts the dictionaries managed by automatic speech recognition (ASR) systems to the input speech, in terms of both lexicon and language model (LM). The adaptation addresses the problem of out-of-vocabulary (OOV) words, which are likely to appear in most realistic scenarios, and uses external knowledge sources to extend the capabilities of the LMs present in the systems. OOV words are handled by existing TD strategies that detect and resolve them, together with dedicated word selection processes that decide which words to add or delete, so that the vocabulary is updated continuously. We also propose a mathematical model for controlling the vocabulary size of the ASR system, as well as the word addition and deletion rates involved. The overall LM is then updated through an interpolation scheme with smaller LMs built from external language knowledge that depends on the current speech and on the words to be added at each time. We designed a realistic experimental framework for evaluating the strategy, employing ASR systems with moderate vocabulary sizes and a couple of test speech corpora with very distinct features. The results show that the dynamic TD strategy yields a generally positive trend in word error rate (WER) improvement over systems without it, and reaches a significant difference after a few hours of speech processing.
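
The abstract does not specify the interpolation scheme, so the following is only a minimal sketch of the general idea, assuming plain linear interpolation between unigram distributions. The function name `interpolate_unigram_lms`, the `weight` parameter, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
from typing import Dict


def interpolate_unigram_lms(
    base: Dict[str, float],
    small: Dict[str, float],
    weight: float,
) -> Dict[str, float]:
    """Linearly interpolate two unigram LMs (illustrative assumption only).

    `base` stands in for the overall LM and `small` for a smaller LM built
    from external knowledge. `weight` is the share given to `small`.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    # The merged vocabulary includes words known to only one of the models,
    # which is how a previously out-of-vocabulary word gains probability mass.
    vocabulary = set(base) | set(small)
    return {
        word: (1.0 - weight) * base.get(word, 0.0) + weight * small.get(word, 0.0)
        for word in vocabulary
    }


# Toy distributions (made up for illustration); "covid" is OOV for the base LM.
base_lm = {"the": 0.6, "cat": 0.4}
small_lm = {"the": 0.5, "covid": 0.5}
mixed_lm = interpolate_unigram_lms(base_lm, small_lm, weight=0.2)
```

Because both inputs sum to one and the weights sum to one, the mixed distribution also sums to one, and a word absent from the base vocabulary receives nonzero probability. A real system would interpolate full n-gram models and would typically tune the weight on held-out data.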
