Speech recognition using various sequential networks

Seiichi Nakagawa, Isao Hayakawa

doi:10.1002/scj.4690231406

Abstract

In this paper, we describe two neural network-based approaches to speech recognition. The first is a purely neural network approach that approximates a multiple-order hidden Markov model (HMM); it extends the sequential network and is called the neural Markov model (NMM). The second combines sequential networks with the DP matching (DTW) method. The NMM is based on the idea that a sequential network is similar to an HMM with one state, so we attempt to approximate an HMM with several states by concatenating sequential networks. For this purpose, we tried two approaches: (1) using only neural networks, and (2) applying the DP matching method to the outputs of the networks. The speaker-independent recognition rate using the first approach was 65.4 percent for nine syllables that included voiced plosive consonants. The second method produced an 88.1 percent recognition rate for artificial word data consisting of two concatenated syllables. These results outperformed those obtained with the HMM.
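
As a rough illustration of the DP matching (DTW) step named above, the following minimal sketch computes the cumulative alignment cost between two sequences with the standard dynamic-programming recurrence. It is a generic textbook form, not the procedure used in the paper: the function name `dtw_distance`, the scalar frames, and the absolute-difference local cost are all assumptions made for illustration, and the abstract does not specify how the network outputs enter the cost.

```python
import math


def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Return the minimum cumulative alignment cost between sequences a and b.

    Hypothetical helper: `dist` is an assumed local cost between two frames.
    In a network-based system it could be derived from per-frame network
    outputs, but that mapping is not described in the abstract.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Allowed predecessor moves: advance a only, b only, or both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Under these assumptions, a sequence and a time-stretched copy of it align at zero cost, for example `dtw_distance([1, 2, 3], [1, 2, 2, 3])`, which is the property that makes DP matching useful for speech segments of varying duration.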
