Abstract

A multilevel training procedure is proposed for automatic speech recognition in noise. In this procedure, Gaussian noise is added to the training samples at six levels, corresponding to six signal-to-noise ratios (SNRs). The procedure was tested on three speech recognizers, each based on a different recognition technique: dynamic time warping (DTW), discrete hidden Markov modeling (DHMM), and a neural network (NN). All three recognizers were evaluated on a common database for isolated-word recognition of the ten digits. In the first series of tests, the recognizers were trained on clean samples and tested on samples with additive Gaussian noise; the performance of all three degraded significantly as the SNR decreased. In the second series of tests, the multilevel training procedure was applied: the recognizers were trained on noisy samples and tested on noisy samples. In contrast to the first series, all three recognizers were much better able to maintain their performance as the SNR decreased. The multilevel procedure is therefore effective in improving recognizer performance for automatic speech recognition in noise.
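
The noise-mixing step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the six SNR values, so `SNR_LEVELS_DB` is hypothetical, and the function names `add_gaussian_noise` and `multilevel_training_set` are invented here. The sketch also assumes that the SNR is defined on average signal power over the whole sample, and that the noisy copies at all levels are pooled into one training set; the abstract does not specify either point, nor whether clean samples are retained.

```python
import math
import random


def add_gaussian_noise(signal, snr_db, rng):
    """Return a copy of `signal` with white Gaussian noise at `snr_db` dB SNR."""
    if not signal:
        raise ValueError("signal must contain at least one sample")
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def multilevel_training_set(samples, snr_levels_db, seed=0):
    """Expand each (signal, label) pair into one noisy copy per SNR level.

    Returns a list of (noisy_signal, label, snr_db) tuples.
    """
    rng = random.Random(seed)
    augmented = []
    for signal, label in samples:
        for snr_db in snr_levels_db:
            noisy = add_gaussian_noise(signal, snr_db, rng)
            augmented.append((noisy, label, snr_db))
    return augmented


# Hypothetical values: the abstract states six levels but does not list them.
SNR_LEVELS_DB = [30, 25, 20, 15, 10, 5]

# Toy stand-in for one utterance of a digit: a sine wave, labeled "7".
clean = [math.sin(2.0 * math.pi * 5.0 * n / 1000.0) for n in range(20000)]
training_set = multilevel_training_set([(clean, "7")], SNR_LEVELS_DB)
```

Each clean sample yields six noisy copies, so the training set grows sixfold. A recognizer (DTW templates, DHMM, or NN) would then be trained on `training_set` in place of the clean samples alone.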
