Abstract

While prior works have demonstrated the effectiveness of Graphics Processing Units (GPUs) for limited vocabulary speech recognition, these methods were unsuitable for recognition with large language models. To overcome this limitation, we previously introduced a novel "on-the-fly rescoring" approach in which search was performed over a WFST network composed with a unigram language model on the GPU, and partial hypotheses were rescored on-the-fly using a large language model stored on the CPU. In this paper, we extend our previous algorithm to enable on-the-fly rescoring to be performed over an H-level network composed with any n-gram language model, and show that using a longer language model history in the H-level network improves decoding speed. We demonstrate that large language models can be applied on-the-fly with no degradation in decoding speed, realizing an LVCSR system that performs recognition over 22× faster than a CPU implementation with no loss in recognition accuracy.
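As a rough illustration of the on-the-fly rescoring idea summarized above (a sketch under assumed interfaces, not the paper's actual implementation), the snippet below shows one common way such a correction is applied: the unigram language-model contribution already folded into the GPU search score is subtracted out and replaced with the score from the larger n-gram model held on the CPU. The names rescore_partial, unigram_logprob, and ngram_logprob are hypothetical placeholders.

    # Minimal sketch of on-the-fly rescoring of a partial hypothesis.
    # The two LM accessors are hypothetical callables: one backed by the
    # unigram model compiled into the GPU search network, the other by
    # the large n-gram language model kept on the CPU.
    def rescore_partial(search_score, word, history,
                        unigram_logprob, ngram_logprob):
        # Remove the unigram LM contribution applied during GPU search.
        corrected = search_score - unigram_logprob(word)
        # Add the large-LM score conditioned on the full word history.
        corrected += ngram_logprob(word, history)
        return corrected

    # Example usage (scores are log-probabilities):
    # new_score = rescore_partial(old_score, "cat", ("the",),
    #                             unigram_logprob, ngram_logprob)

The same subtract-and-replace arithmetic can be applied each time a word boundary is crossed during search, which is what allows the large model to stay on the CPU while the GPU explores the smaller unigram-composed network.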
