Abstract

In this paper we compare a standard HMM-based recognizer to a highly parameter-efficient hybrid denoted the hidden neural network (HNN). The comparison was carried out on a speaker-independent command word recognition task aimed at hands-free applications in cars. Monophone-based HMM and HNN recognizers were initially trained on clean Wall Street Journal British English data. Evaluation of these baseline models on noisy car speech data indicated superior performance of the HMMs. After smoothing to the car environment, however, an HNN with 28k parameters provided a relative error rate reduction of 23-53% over HMMs containing 21k-168k parameters. Because the HNNs have so few parameters, their real-time decoding complexity is 2-4 times lower than that of comparable HMMs. The low memory and computational requirements of the HNN make it particularly attractive for implementation on portable commercial hardware such as mobile phones and personal digital assistants.

