Abstract

Feature-space maximum likelihood linear regression (fMLLR) is a widely used technique for speaker adaptation in HMM-based speech recognition. However, in extremely resource-constrained systems, the time required to accumulate the sufficient statistics for fMLLR adaptation can be considerable. In this paper we describe a novel method that can significantly reduce the time taken for statistics accumulation while preserving the adaptation gains. The proposed quick fMLLR (Q-fMLLR) algorithm is implemented in a state-of-the-art large-vocabulary continuous speech recognition system and evaluated on a broadcast transcription task. We present results in terms of both the average likelihood after adaptation and the character error rate. The results show that Q-fMLLR attains the performance of regular fMLLR with a fraction of the computation.
