Abstract

This paper describes a progressive language model (LM) adaptation method for transcribing broadcast news during a sudden event such as a massive earthquake. In a practical automatic speech recognition (ASR) system, a new event whose linguistic context is not covered by the LM often causes serious performance degradation. Furthermore, there may not be enough training text for conventional LM adaptation methods such as linear interpolation. We therefore propose a new LM adaptation method that uses ASR transcriptions as unsupervised training texts in addition to the online manuscripts written by reporters. The proposed method employs a progressive update procedure that adapts the LM in an unsupervised manner using each set of transcriptions collected over a short period, so that the adapted model can be used immediately. The method also uses the online manuscripts to adapt the LM and to add new words to the vocabulary. Experimental results showed that the proposed progressive LM adaptation method achieved a relative word error rate reduction of 8.2% compared with conventional LM adaptation using the online manuscripts only.
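
The progressive update idea can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a unigram model, a fixed interpolation weight `lam`, and transcription batches that are already available as token lists, and the function names (`unigram_probs`, `interpolate`, `progressive_adapt`) are hypothetical. In a real system each batch would be produced by decoding audio with the most recently adapted LM.

```python
from collections import Counter


def unigram_probs(tokens):
    """Maximum-likelihood unigram probabilities from a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def interpolate(base, adapt, lam):
    """Linear interpolation: P(w) = (1 - lam) * P_base(w) + lam * P_adapt(w).

    Words seen only in the adaptation text enter the vocabulary here.
    """
    vocab = set(base) | set(adapt)
    return {
        w: (1 - lam) * base.get(w, 0.0) + lam * adapt.get(w, 0.0)
        for w in vocab
    }


def progressive_adapt(base_lm, manuscripts, transcript_batches, lam=0.3):
    """Return the sequence of adapted LMs, one per update step.

    Step 0 uses the manuscripts only; each later step also folds in one
    more short-period batch of (unsupervised) ASR transcriptions.
    """
    adapt_tokens = list(manuscripts)
    if adapt_tokens:
        lm = interpolate(base_lm, unigram_probs(adapt_tokens), lam)
    else:
        lm = dict(base_lm)
    models = [lm]
    for batch in transcript_batches:
        adapt_tokens.extend(batch)  # accumulate the new transcriptions
        lm = interpolate(base_lm, unigram_probs(adapt_tokens), lam)
        models.append(lm)  # adapted model is available immediately
    return models
```

For example, calling `progressive_adapt(base_lm, manuscript_tokens, [batch1, batch2])` yields three models: one adapted with the manuscripts alone, and two more that successively incorporate each transcription batch. The toy unigram model is chosen only to keep the sketch short; the same update loop would apply to n-gram counts.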
