Investigations on features for log-linear acoustic models in continuous speech recognition

S Wiesler, R Schluter, M Nussbaum-Thom, G Heigold, H Ney

doi:10.1109/asru.2009.5373362

Abstract

Hidden Markov Models with Gaussian Mixture Models as emission probabilities (GHMMs) are the underlying structure of all state-of-the-art speech recognition systems. Gaussian mixture distributions follow the generative approach, in which the class-conditional probability is modeled, although classification requires only the posterior probability. Direct modeling of posterior probabilities with log-linear models has been very successful in related fields such as Natural Language Processing (NLP), but it has rarely been used in speech recognition and has not been applied successfully to continuous speech recognition. In this paper, we report competitive results for a speech recognizer with a log-linear acoustic model on the Wall Street Journal corpus, a Large Vocabulary Continuous Speech Recognition (LVCSR) task. We trained this model from scratch, i.e., without relying on an existing GHMM system. The use of data-dependent sparse features for log-linear models has been proposed previously. We compare them with polynomial features and show that combining polynomial and data-dependent sparse features leads to better results.
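
To make the modeling idea in the abstract concrete, the following is a minimal sketch of a log-linear posterior over classes computed from polynomial features of an acoustic vector. It is an illustration only, not the authors' implementation: the function names `polynomial_features` and `log_linear_posterior`, the second-order default, and the toy weights are all assumptions, and the data-dependent sparse features mentioned in the abstract are not modeled here.

```python
import math
from itertools import combinations_with_replacement


def polynomial_features(x, degree=2):
    """Return all monomials of x up to `degree`, starting with a bias term 1.0.

    Hypothetical helper for illustration; the paper's exact feature
    definition is not given in the abstract.
    """
    feats = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(x)), d):
            feats.append(math.prod(x[i] for i in idx))
    return feats


def log_linear_posterior(x, weights, degree=2):
    """Compute p(c | x) = exp(w_c . f(x)) / sum_c' exp(w_c' . f(x)).

    `weights` holds one weight vector per class, each as long as f(x).
    """
    f = polynomial_features(x, degree)
    scores = [sum(w_i * f_i for w_i, f_i in zip(w, f)) for w in weights]
    m = max(scores)  # subtract the maximum score for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


# Toy usage with made-up numbers: two classes, a 2-dimensional input.
x = [2.0, 3.0]
weights = [
    [0.0, 0.5, -0.2, 0.0, 0.0, 0.0],   # class 0 (assumed values)
    [0.1, -0.3, 0.4, 0.0, 0.0, 0.0],   # class 1 (assumed values)
]
posterior = log_linear_posterior(x, weights)
```

The model normalizes over classes directly, so it yields the posterior needed for classification without first estimating a class-conditional density, which is the contrast with the generative GHMM approach drawn in the abstract.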


Similar Papers


Integrate template matching and statistical modeling for continuous speech recognition
Xie Sun
01 Jan 2010

Discrete-Mixture HMMs-based Approach for Noisy Speech Recognition
Tetsuo Kosaka ... Masaharu Katoh
01 Jun 2007

Integrated exemplar-based template matching and statistical modeling for continuous speech recognition
Xie Sun ... Yunxin Zhao
EURASIP Journal on Audio, Speech, and Music Processing | VOL. 2014
01 Feb 2014

Comparing Fusion Models for DNN-Based Audiovisual Continuous Speech Recognition
Ahmed Hussen Abdelaziz
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 26
01 Mar 2018
