Abstract
Continuous density hidden Markov models (CD-HMMs) are an essential component of modern systems for automatic speech recognition (ASR). These models assign probabilities to the sequences of acoustic feature vectors extracted by signal processing of speech waveforms. In this chapter, we investigate a new framework for parameter estimation in CD-HMMs. Our framework is inspired by recent parallel trends in the fields of ASR and machine learning. In ASR, significant improvements in performance have been obtained by discriminative training of acoustic models. In machine learning, significant improvements in performance have been obtained by discriminative training of large margin classifiers. Building on both these lines of work, we show how to train CD-HMMs by maximizing an appropriately defined margin between correct and incorrect decodings of speech waveforms. We start by defining an objective function over a transformed parameter space for CD-HMMs, then describe how it can be optimized efficiently by simple gradient-based methods. Within this framework, we obtain highly competitive results for phonetic recognition on the TIMIT speech corpus. We also compare our framework for large margin training to other popular frameworks for discriminative training of CD-HMMs.