Abstract
We present various methods for estimating a robust output probability distribution (PD) in speech recognition based on the discrete hidden Markov model (HMM). In speech recognition, we encounter the problem of an insufficient amount of training data, which may cause inaccurate modeling of the HMM parameters, especially the output PD's. In this paper, to enhance the robustness of the output PD's with respect to unseen data, we study two approaches: smoothing and tying of the PD's. We introduce a new algorithm to smooth a PD, in which a smoothing matrix is estimated by following the cross-validation strategy used in deleted interpolation. As for tying, we derive a number of state classes from a clustering tree that achieves a good compromise between the robustness of the tied PD's and the detail with which they specify speech feature characteristics. In addition to providing an efficient method for constructing the clustering tree, we suggest a measure that accounts for the variation of the estimated PD's under various conditions. The performance of the proposed methods is evaluated in speaker-independent isolated word recognition experiments and is shown to yield better recognition accuracy than PD's estimated solely by the maximum likelihood criterion.
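The abstract's smoothing idea, estimating interpolation parameters by cross-validation on held-out data (deleted interpolation), can be illustrated with a minimal sketch. Note this is a simplified scalar-weight version, not the paper's full smoothing-matrix algorithm: `estimate_lambda` and `smooth_pd` are illustrative names, and the detailed PD, backoff PD, and held-out counts below are toy data chosen for the example.

```python
import numpy as np

def estimate_lambda(held_out_counts, detailed_pd, backoff_pd, n_iter=50):
    """Estimate the interpolation weight lambda by EM on held-out counts,
    in the spirit of deleted interpolation: held-out data decides how much
    to trust the detailed (maximum-likelihood) PD vs. a robust backoff PD."""
    lam = 0.5
    for _ in range(n_iter):
        # Posterior responsibility of the detailed model for each symbol.
        num = lam * detailed_pd
        denom = num + (1.0 - lam) * backoff_pd
        resp = np.where(denom > 0, num / denom, 0.0)
        # Re-estimate lambda as the held-out-count-weighted mean responsibility.
        lam = np.sum(held_out_counts * resp) / np.sum(held_out_counts)
    return lam

def smooth_pd(detailed_pd, backoff_pd, lam):
    """Convex combination of the detailed and backoff PD's."""
    return lam * detailed_pd + (1.0 - lam) * backoff_pd

# Toy example: a 4-symbol output PD where two symbols were never seen
# in training, so the maximum-likelihood PD assigns them zero probability.
detailed = np.array([0.5, 0.5, 0.0, 0.0])     # ML estimate from training data
backoff = np.full(4, 0.25)                    # uniform backoff distribution
held_out = np.array([2.0, 1.0, 1.0, 0.0])     # counts from held-out data

lam = estimate_lambda(held_out, detailed, backoff)
smoothed = smooth_pd(detailed, backoff, lam)  # unseen symbols now get mass > 0
```

Because the held-out set contains a symbol the detailed PD assigns zero probability, the estimated weight stays below one and the smoothed PD leaves no symbol with zero probability, which is the robustness property the smoothing is after.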