Abstract

This paper describes a novel design for a neural-network-based speech recognition system for isolated Cantonese syllables. Since Cantonese is a monosyllabic and tonal language, the recognition system consists of a tone recognizer and a base syllable recognizer. The tone recognizer adopts the architecture of a multi-layer perceptron in which each output neuron represents a particular tone. The syllable recognizer contains a large number of independently trained recurrent networks, each representing a designated Cantonese syllable. This modular structure makes it possible to expand the system vocabulary progressively by adding new syllable models. To demonstrate the effectiveness of the proposed method, a speaker-dependent recognition system has been built with a vocabulary growing from 40 to 200 syllables. With 200 syllables, the system attains a top-1 recognition accuracy of 81.8% and a top-3 accuracy of 95.2%.
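The modular structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class `SyllableRecognizer`, the functions `recognize_tone` and `make_toy_scorer`, the frame format, and the syllable labels are all hypothetical names chosen for illustration, and each recurrent network is replaced by a plain scoring function. The sketch only shows how a bank of independent per-syllable models supports top-k ranking and vocabulary growth without touching existing models.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Assumed input format: one feature vector per speech frame.
Frames = Sequence[Sequence[float]]
# Stand-in for one independently trained recurrent network:
# it maps an utterance to a single match score.
Scorer = Callable[[Frames], float]


class SyllableRecognizer:
    """Bank of per-syllable models, one scorer per base syllable."""

    def __init__(self) -> None:
        self.models: Dict[str, Scorer] = {}

    def add_syllable(self, label: str, scorer: Scorer) -> None:
        # Growing the vocabulary only registers a new model;
        # models already in the bank are left unchanged.
        self.models[label] = scorer

    def top_k(self, frames: Frames, k: int = 3) -> List[Tuple[str, float]]:
        # Score the utterance with every model and keep the k best.
        scored = [(label, scorer(frames)) for label, scorer in self.models.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


def recognize_tone(tone_outputs: Sequence[float]) -> int:
    # One output neuron per tone: return the index of the largest activation.
    return max(range(len(tone_outputs)), key=lambda i: tone_outputs[i])


def make_toy_scorer(target: float) -> Scorer:
    # Toy stand-in for a trained network (an assumption for this sketch):
    # the score is higher when the mean of the first feature is near `target`.
    def score(frames: Frames) -> float:
        mean = sum(frame[0] for frame in frames) / len(frames)
        return -abs(mean - target)

    return score
```

In this sketch, registering a further syllable is a single `add_syllable` call, which mirrors the progressive vocabulary expansion the abstract describes; a real system would plug a trained recurrent network in place of each toy scorer.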
