Abstract

In automatic speech recognition (ASR), adaptation techniques are used to minimize the mismatch between training and testing conditions. Many successful techniques have been proposed for deep neural network (DNN) acoustic model (AM) adaptation. Recently, recurrent neural networks (RNNs) have outperformed DNNs in ASR tasks. However, the adaptation of RNN AMs is challenging, and in some cases adapted DNN AMs outperform adapted RNN AMs. In this paper, we combine student-teacher training and unsupervised adaptation to improve ASR performance. First, RNNs are used as teachers to train student DNNs. Then, these student DNNs are adapted in an unsupervised fashion. Experimental results on the AMI IHM and AMI SDM tasks show that student DNNs are adaptable, with significant performance improvements for both frame-wise and sequentially trained systems. We also show that combining the adapted DNNs with the teacher RNNs can further improve performance.
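The student-teacher step described above can be sketched as a distillation loss: the student DNN is trained to match the teacher RNN's softened frame posteriors. The snippet below is a minimal NumPy illustration, not the paper's implementation; the temperature value and function names are assumptions for the sketch.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; a higher temperature yields
    # softer (more uniform) posterior distributions.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Student-teacher loss: cross-entropy between the teacher's
    # softened posteriors and the student's softened posteriors,
    # averaged over frames. Minimized when the student reproduces
    # the teacher's output distribution.
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = np.log(softmax(student_logits, temperature) + 1e-12)
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())
```

In this formulation the teacher's posteriors serve as soft labels, so no transcription is needed for the distillation step itself; the later unsupervised adaptation pass would similarly rely on the system's own hypotheses rather than manual labels.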

