Abstract

This paper describes a new neural network structure and a corresponding new sequential training technique for speech recognition. The proposed system is a modification of the original time delay neural network (TDNN) structure of Waibel et al. [1989]. The new structure consists of a group of sub-nets, one for each isolated word or phoneme to be recognized. Because each sub-net deals with only one recognition unit, it can be trained independently of the others. Each sub-net is a TDNN, which the authors train with a new sequential training algorithm. The system attained close to 100% accuracy on a multi-speaker, isolated word recognition task and 86.44% accuracy on a speaker-independent phoneme recognition task involving the three voiced stop consonants (B, D and G). The phoneme recognition result compares favorably with the best result obtained by Bryant [1992] using Sawai's block-windowed neural network architecture, an improvement of 14.44% on the same task.
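The one-sub-net-per-unit structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no layer sizes, no details of the sequential training algorithm, and no decision rule, so everything below is an assumption made for illustration. The names `SubNet`, `time_delay_layer` and `recognize` are hypothetical, the weights are random and untrained, each sub-net has a single time-delay layer, and the recognized unit is taken to be the sub-net with the highest score.

```python
import math
import random


def time_delay_layer(frames, weights, bias):
    """Apply one time-delay layer (a 1-D convolution over time).

    frames:  list of T feature vectors
    weights: one entry per hidden unit; each entry holds D delay taps,
             and each tap is a weight vector as long as a feature vector
    bias:    one bias per hidden unit
    Returns T - D + 1 frames of sigmoid activations.
    """
    delays = len(weights[0])
    out = []
    for t in range(len(frames) - delays + 1):
        row = []
        for unit, b in zip(weights, bias):
            s = b
            for d in range(delays):
                s += sum(w * x for w, x in zip(unit[d], frames[t + d]))
            row.append(1.0 / (1.0 + math.exp(-s)))
        out.append(row)
    return out


class SubNet:
    """Hypothetical sub-net for ONE recognition unit (untrained, random weights)."""

    def __init__(self, n_features, n_hidden=4, delays=3, seed=0):
        rng = random.Random(seed)
        self.weights = [[[rng.uniform(-0.5, 0.5) for _ in range(n_features)]
                         for _ in range(delays)]
                        for _ in range(n_hidden)]
        self.bias = [0.0] * n_hidden
        self.out_w = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]

    def score(self, frames):
        hidden = time_delay_layer(frames, self.weights, self.bias)
        # Average the per-frame output over time, then squash to (0, 1).
        per_frame = [sum(w * h for w, h in zip(self.out_w, row)) for row in hidden]
        mean = sum(per_frame) / len(per_frame)
        return 1.0 / (1.0 + math.exp(-mean))


def recognize(subnets, frames):
    """Assumed decision rule: pick the unit whose sub-net scores highest."""
    scores = {label: net.score(frames) for label, net in subnets.items()}
    return max(scores, key=scores.get), scores


# One independent sub-net per voiced stop consonant; input sizes are arbitrary.
subnets = {label: SubNet(n_features=16, seed=i) for i, label in enumerate("BDG")}
rng = random.Random(42)
frames = [[rng.random() for _ in range(16)] for _ in range(15)]
label, scores = recognize(subnets, frames)
```

Because each `SubNet` owns its own weights and shares nothing with the others, any training procedure could update one sub-net without touching the rest, which is the property the abstract highlights. Training itself is omitted here, since the abstract does not describe the algorithm.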

