Abstract

In this paper, we describe two neural network-based approaches to speech recognition. One is a purely neural network approach that approximates a multiple-order hidden Markov model (HMM). It is an extension of a sequential network and is called the neural Markov model (NMM). The other approach combines sequential networks with the dynamic programming (DP) matching method, also known as dynamic time warping (DTW). The NMM is based on the idea that a sequential network is similar to an HMM with one state; we attempt to approximate an HMM with several states by concatenating sequential networks. For this purpose, we tried two approaches: (1) using only neural networks, and (2) applying the DP matching method to the output of the networks. The speaker-independent recognition rate using the first approach was 65.4 percent for nine syllables that included voiced plosive consonants. The second approach produced an 88.1 percent recognition rate for artificial word data consisting of two concatenated syllables. These results outperformed those obtained with the HMM.
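
For readers unfamiliar with DP matching, the following is a minimal sketch of the textbook DTW recurrence, not the procedure used in the paper, which the abstract does not specify. It assumes each sequence is a list of per-frame vectors (for example, network outputs) and uses Euclidean distance as the local cost; the names `dtw_distance` and `euclidean` are hypothetical.

```python
import math


def euclidean(u, v):
    """Local cost between two frames (assumed choice, not from the paper)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def dtw_distance(seq_a, seq_b, dist=euclidean):
    """Accumulated cost of the best monotonic alignment of two sequences."""
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            # Allowed steps: advance in seq_a only, seq_b only, or both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


# A time-stretched copy of a pattern aligns to it with zero cost.
reference = [[0.0], [1.0], [2.0]]
stretched = [[0.0], [1.0], [1.0], [2.0]]
print(dtw_distance(stretched, reference))
```

In a recognizer of this general kind, an input would be scored against one reference pattern per word and assigned to the word with the lowest accumulated cost; whether the paper follows that scheme is not stated in the abstract.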
