Abstract

In this paper, the problem of fast model adaptation for nonnative speakers is addressed from a perspective of model complexity selection. The key challenge lies in reliable complexity selection when only a small amount of adaptation data is available. A novel maximum expected likelihood (MEL) based technique is proposed to enable model complexity selection from using as little as one adaptation sentence. In MEL, the expectation of loglikelihood is computed based on the mismatch bias between model and data which is measured by a small amount of adaptation data, and model complexity is selected to maximize EL. Experiments were performed on WSJ data of speakers with a wide range of foreign accents. The proposed method led to consistent and significant improvement on recognition accuracy over MLLR for nonnative speakers, without performance degradation on native speakers. The proposed method was able to dynamically select optimal model complexity as the available adaptation data increased.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call