Abstract

In automatic speech recognition (ASR) applications, log likelihood ratio testing (LRT) is one of the most popular techniques for obtaining a confidence measure (CM). Unlike the traditional log likelihood ratio (LLR) based method, we apply nonlinear transformations to the LLRs before computing the string-level CM, and different phonemes may use different transformation functions. With suitable LLR transformations, the verification performance of the string-level CM can be improved. The transformation functions are implemented by a multilayer perceptron (MLP), whose parameters are optimized with two algorithms: the minimum verification error (MVE) algorithm and the figure-of-merit (FOM) training algorithm. In our Mandarin command recognition system, both methods markedly improve the performance of confidence measures for out-of-vocabulary word rejection compared with the standard LRT-based CM, yielding a best relative reduction in equal error rate (EER) of 45.5%. In addition, in our Mandarin command recognition experiments the FOM training algorithm outperforms the MVE algorithm, even though the two reach approximately the same best performance; given the limited experimental setups, however, which algorithm is better remains an open question.
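The abstract does not specify the MLP architecture, the rule for combining phone-level scores into a string-level CM, or the exact training objective. The sketch below is therefore only illustrative: it assumes a one-hidden-layer MLP per phoneme, simple averaging of the transformed phone-level LLRs, and a standard sigmoid-smoothed verification-error objective in the spirit of MVE training. All function names and parameter values are hypothetical, not the paper's implementation.

```python
import numpy as np

def mlp_transform(llr, w1, b1, w2, b2):
    """Nonlinearly transform one scalar phone-level LLR with a
    one-hidden-layer MLP (assumed architecture: tanh hidden units,
    linear scalar output)."""
    h = np.tanh(w1 * llr + b1)   # hidden layer activations
    return float(w2 @ h + b2)    # transformed LLR (scalar)

def string_cm(phone_llrs, phone_ids, params):
    """String-level CM from transformed phone-level LLRs.
    Each phoneme uses its own transformation (per the abstract);
    averaging as the combination rule is an assumption."""
    scores = [mlp_transform(llr, *params[p])
              for llr, p in zip(phone_llrs, phone_ids)]
    return sum(scores) / len(scores)

def smoothed_verification_error(cm_scores, labels, theta=0.0, gamma=5.0):
    """Sigmoid-smoothed error count in the spirit of MVE training:
    labels are 1 for in-vocabulary strings (should be accepted) and
    0 for out-of-vocabulary strings (should be rejected); theta is
    the decision threshold, gamma the smoothing slope (both assumed)."""
    d = np.where(labels == 1, theta - cm_scores, cm_scores - theta)
    return float(np.mean(1.0 / (1.0 + np.exp(-gamma * d))))

# Toy usage with two hypothetical phonemes "a" and "n".
rng = np.random.default_rng(0)
H = 4  # hidden-layer size (assumed)
params = {p: (rng.normal(size=H), np.zeros(H), rng.normal(size=H), 0.0)
          for p in ("a", "n")}
cm = string_cm([1.2, -0.3, 0.7], ["a", "n", "a"], params)
print("string-level CM:", cm)
```

Gradient-descent updates of the per-phoneme MLP parameters against an objective such as `smoothed_verification_error` (for MVE) or an area-under-ROC surrogate (for FOM) would complete the picture; the choice between the two is exactly the comparison the paper reports.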
