Abstract

Discriminative model combination is a new approach in automatic speech recognition that aims at an optimal integration of all given (acoustic and language) models into one log-linear posterior probability distribution. In contrast to the maximum entropy approach, the coefficients of the log-linear combination are optimized on training samples using discriminative methods to obtain an optimal classifier. Three methods for finding coefficients that minimize the empirical word error rate on given training data are discussed: the well-known generalised probabilistic descent (GPD) based minimum error rate training, which leads to an iterative optimization scheme; minimization of the mean distance between the discriminant function of the log-linear posterior probability distribution and an ideal discriminant function; and minimization of a smoothed error count, where the smoothing function is a parabola. The latter two methods yield closed-form solutions for the combination coefficients. Experimental results show that the accuracy of a large-vocabulary continuous speech recognition system can be increased by discriminative model combination, owing to a better exploitation of the given acoustic and language models.
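The two central ideas of the abstract — a log-linear combination of model log-scores, and a closed-form least-squares fit of the resulting discriminant to an ideal discriminant function — can be sketched as follows. This is a minimal illustration under assumed conventions (the function names and the toy data are hypothetical), not the paper's actual implementation:

```python
import numpy as np

def log_linear_posterior(log_scores, lam):
    """Combine per-model log-probabilities into one posterior.

    log_scores: (num_hypotheses, num_models) matrix, where each column
                holds one model's log-probabilities (e.g. acoustic, LM).
    lam:        (num_models,) combination coefficients.
    """
    z = log_scores @ lam          # weighted sum of log-scores per hypothesis
    z -= z.max()                  # stabilize before exponentiation
    p = np.exp(z)
    return p / p.sum()            # normalize to a posterior distribution

def fit_coefficients(log_scores, ideal):
    """Least-squares fit of the log-linear discriminant to an ideal one.

    A sketch of the closed-form idea: choose lam minimizing the mean
    squared distance between log_scores @ lam and the ideal discriminant
    (e.g. 1 for the correct hypothesis, 0 otherwise).
    """
    lam, *_ = np.linalg.lstsq(log_scores, ideal, rcond=None)
    return lam

# Toy example: 3 hypotheses scored by 2 models (columns).
log_scores = np.array([[-1.0, -2.0],
                       [-3.0, -1.0],
                       [-2.0, -4.0]])
lam = np.array([1.0, 1.0])        # equal weighting of both models
posterior = log_linear_posterior(log_scores, lam)

ideal = np.array([1.0, 0.0, 0.0])  # hypothesis 0 is assumed correct
lam_fit = fit_coefficients(log_scores, ideal)
```

The recognizer's decision is then the hypothesis with maximal combined score; the fitted coefficients reweight the models discriminatively rather than assuming they are equally reliable.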
