Speaker-independent speech recognition using a neural prediction model

Ken‐Ichi Iso, Takao Watanabe

doi:10.1002/ecjc.4430740803

Abstract

This paper proposes a speech recognition system based on pattern prediction by neural networks. In the proposed system, an independent nonlinear predictor, composed of a sequence of multilayer perceptrons (MLPs), is prepared for each class to be recognized. The temporal structure of the speech pattern, in particular the temporal correlation between successive feature vectors, is represented by the nonlinear input-output mapping of the predictor and is used as an important feature for recognition. Variation in the temporal structure due to speaker differences and utterance fluctuation, on the other hand, is normalized by dynamic programming. To determine the MLP parameters of each predictor, an iterative training algorithm combining dynamic programming and error backpropagation is proposed, together with a proof of its convergence. A speaker-independent isolated-digit recognition experiment is conducted to examine the basic operation of the proposed system. The results indicate that the parameters are estimated satisfactorily even from a small amount of training data and that high recognition performance is achieved.
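
The recognition step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it beyond the abstract is an assumption made for illustration: each class is modeled as a left-to-right list of predictor functions (stand-ins for the trained MLPs), each predictor maps only the single preceding frame to the current frame, the local cost is the squared prediction error, and the alignment may stay in a state or advance by one state per frame. The names `accumulated_prediction_error` and `recognize` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Frame = Sequence[float]
# Assumption: a predictor maps the previous frame to a predicted current frame.
Predictor = Callable[[Frame], List[float]]


def squared_error(predicted: Frame, actual: Frame) -> float:
    """Squared Euclidean distance between a predicted and an observed frame."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


def accumulated_prediction_error(frames: Sequence[Frame],
                                 predictors: Sequence[Predictor]) -> float:
    """Minimum total prediction error over left-to-right alignments.

    Dynamic programming assigns each predicted frame to one predictor
    (state). The path starts in state 0, ends in the last state, and at
    each frame either stays in its state or advances by one.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to form a prediction")
    n_states = len(predictors)
    inf = float("inf")

    # Initialise with the first predicted frame, which must use state 0.
    prev = [inf] * n_states
    prev[0] = squared_error(predictors[0](frames[0]), frames[1])

    for t in range(2, len(frames)):
        cur = [inf] * n_states
        for n in range(n_states):
            best = prev[n] if n == 0 else min(prev[n], prev[n - 1])
            if best < inf:
                local = squared_error(predictors[n](frames[t - 1]), frames[t])
                cur[n] = best + local
        prev = cur

    # The alignment must end in the final state; inf means no valid path.
    return prev[n_states - 1]


def recognize(frames: Sequence[Frame],
              models: Dict[str, Sequence[Predictor]]) -> str:
    """Return the class label whose predictors give the smallest error."""
    return min(models,
               key=lambda label: accumulated_prediction_error(frames, models[label]))
```

In the training procedure the abstract describes, an alignment found by dynamic programming would fix which frames each MLP is responsible for, backpropagation would then update that MLP on those frames, and the two steps would be repeated. The sketch covers only the alignment and decision steps and treats the predictors as already trained.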
