A Bayesian approach for building triphone models for continuous speech recognition

Ji Ming, M. Owens, F. J. Smith, P. O'Boyle

doi:10.1109/89.799693

Abstract

This paper introduces a new statistical framework for constructing triphone models from models of lower context dependency. This composition reduces the number of models to be estimated by more than an order of magnitude and is therefore of great significance in relieving the data sparsity problem in triphone-based continuous speech recognition. The new framework is derived from Bayesian statistics and represents an alternative to other triphone-by-composition techniques, particularly the model-interpolation and quasitriphone approaches. The potential power of this new framework is explored through an implementation based on the hidden Markov modeling technique. It is shown that the new model structure includes the quasitriphone model as a special case and leads to more efficient parameter estimation than the model-interpolation method. Phone recognition experiments show an increase in accuracy over that obtained by comparable models.
