Abstract

A new modular recurrent neural network (MRNN)-based method for continuous Mandarin speech recognition is proposed. The system uses five RNNs to accomplish many subtasks separately and then combine them to integrally solve the problem. They include two RNNs for the discrimination of the two sub-syllable groups of 100 right-final-dependent (RFD) initials and 39 context independent (CI) finals, two RNNs for the generation of dynamic weighting functions for sub-syllable's integration, and one RNN for syllable boundary detection. All RNN modules are combined using a delay-decision Viterbi search. The method differs from the ANN/HMM hybrid approach of using ANNs to perform not only sub-syllables discrimination but also temporal structure modeling of the speech signal. The system is trained using a three-stage training method embedding with the MCE/GPD algorithms. Besides, a fast recognition method using multi-level pruning is also proposed. Experimental results showed that it outperforms the HMM method on both the recognition accuracy and the computational complexity.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.