Abstract

In this paper, we investigate the use of the proposed non-parametric exemplar-based acoustic modeling for the NIST Open Keyword Search 2015 Evaluation. Specifically, kernel-density model is used to replace GMM in HMM/GMM (Hidden Markov Model / Gaussian Mixture Model) or DNN in HMM/DNN (Hidden Markov Model / Deep Neural Network) acoustic model to predict the emission probability of HMM states. To get further improvement, likelihood score generated by the kernel-density model is discriminatively tuned by the score tuning module realized by a neural network. Various configurations for score tuning module have been examined to show that simple neural network with 1 hidden layer is sufficient to fine tune the likelihood score generated by the kernel-density model. With this architecture, our exemplar-based model outperforms the 9-layer-DNN acoustic model significantly for both the speech recognition and keyword search tasks. In addition, our proposed exemplar-based system provides complementary information to other systems and we can further benefit from system combination.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call