Abstract
In this paper, we investigate techniques to combine hybrid HMM-DNN (hidden Markov model - deep neural network) and tandem HMM-GMM (hidden Markov model - Gaussian mixture model) acoustic models using: (1) model averaging, and (2) lattice combination with Minimum Bayes Risk decoding. We performed experiments on the “TED Talks” task following the protocol of the IWSLT-2012 evaluation. Our experimental results suggest that DNN-based and GMM-based acoustic models are complementary, with error rates reduced by up to 8% relative when the DNN and GMM systems are combined at the model level in a multi-pass automatic speech recognition (ASR) system. Further gains were obtained by combining the model-averaged lattices with those obtained from the baseline systems.
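As a rough illustration of what model-level combination can look like, the sketch below interpolates per-frame, per-state acoustic log-likelihoods from two systems. The abstract does not specify the averaging scheme, so the log-linear form, the function name `average_log_likelihoods`, and the `dnn_weight` parameter are all assumptions made for illustration rather than the paper's method.

```python
def average_log_likelihoods(dnn_loglik, gmm_loglik, dnn_weight=0.5):
    """Interpolate two [frames][states] tables of acoustic log-likelihoods.

    Assumption (not from the paper): scores are combined log-linearly as
    dnn_weight * log p_dnn + (1 - dnn_weight) * log p_gmm.
    """
    if not 0.0 <= dnn_weight <= 1.0:
        raise ValueError("dnn_weight must lie in [0, 1]")
    if len(dnn_loglik) != len(gmm_loglik):
        raise ValueError("both systems must score the same number of frames")

    combined = []
    for dnn_frame, gmm_frame in zip(dnn_loglik, gmm_loglik):
        if len(dnn_frame) != len(gmm_frame):
            raise ValueError("both systems must score the same set of states")
        # Weighted sum of the two log-likelihoods for each HMM state.
        combined.append([
            dnn_weight * d + (1.0 - dnn_weight) * g
            for d, g in zip(dnn_frame, gmm_frame)
        ])
    return combined


# Hypothetical scores for one frame and two states (illustrative values only).
dnn_scores = [[-1.0, -3.0]]
gmm_scores = [[-3.0, -1.0]]
averaged = average_log_likelihoods(dnn_scores, gmm_scores, dnn_weight=0.5)
```

In a setup like this, the interpolation weight would typically be tuned on held-out data, and the combined scores would be passed to the decoder in place of either system's own scores.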