Abstract

A recurrent neural network (RNN) applied as the language model in automatic speech recognition (ASR) can capture the entire history of a word sequence and is therefore theoretically superior to the N-gram language model. RNNs have also been successfully applied in ASR acoustic models and have greatly improved performance. In this paper, we combine RNN and N-gram language models and apply them to a domain-specific speech recognition task. A rectification mechanism is introduced into the RNN language model training procedure to prevent early stopping and to mitigate the vanishing gradient problem. For Chinese word segmentation, a general-purpose algorithm was designed to identify proper names automatically, improving word segmentation accuracy by 2.0–3.0%. Our experimental results show that the new ASR system, which combines an N-gram language model with a domain-specific neural network language model, achieves a significantly lower word error rate than a system using only a standard N-gram language model.
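The abstract does not state how the two language models are combined; a common approach is linear interpolation of their per-word probabilities, and the sketch below illustrates that idea only. The interfaces (`rnn_lm.prob`, `ngram_lm.prob`) and the weight `lam` are hypothetical assumptions for illustration, not the paper's implementation.

```python
# Minimal sketch of combining an RNN language model with an N-gram
# language model by linear interpolation of per-word probabilities.
# NOTE: the model interfaces and the weight `lam` are assumptions;
# the paper's abstract does not specify the combination method.

import math
from typing import Protocol, Sequence


class LanguageModel(Protocol):
    def prob(self, word: str, history: Sequence[str]) -> float:
        """Return P(word | history)."""
        ...


def interpolated_log_prob(words: Sequence[str],
                          rnn_lm: LanguageModel,
                          ngram_lm: LanguageModel,
                          lam: float = 0.5) -> float:
    """Sentence log-probability under the interpolated model:
    P(w | h) = lam * P_rnn(w | h) + (1 - lam) * P_ngram(w | h).
    """
    total = 0.0
    for i, w in enumerate(words):
        history = words[:i]
        p = lam * rnn_lm.prob(w, history) + (1 - lam) * ngram_lm.prob(w, history)
        total += math.log(max(p, 1e-12))  # floor avoids log(0) for unseen words
    return total
```

In practice the interpolation weight would be tuned on held-out data from the target domain, which fits the domain-specific setting described in the abstract.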
