Bayesian duration modeling and learning for speech recognition

Jen-Tzung Chien, Chih-Hsien Huang

doi:10.1109/icassp.2004.1326158

Abstract

We present Bayesian duration modeling and learning for speech recognition under nonstationary speaking rates and noise conditions. Gaussian, Poisson, and gamma distributions are investigated for characterizing duration models, and the maximum a posteriori (MAP) estimate of the gamma duration model is developed. To enable sequential learning, we adopt the Poisson duration model with a gamma prior density, which belongs to the conjugate prior family. When adaptation data are observed sequentially, the resulting gamma posterior density offers two advantages. First, it determines the optimal quasi-Bayes (QB) duration parameter, which can be merged into HMMs for speech recognition. Second, it provides the mechanism for updating the gamma prior statistics for sequential learning. An expectation-maximization algorithm is applied for parameter estimation. In experiments, the proposed Bayesian approaches significantly improve speech recognition performance on Mandarin broadcast news. Batch learning is investigated for the MAP duration model, and sequential learning for the QB duration model.
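
The conjugate Poisson-gamma update mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the state durations are fully observed, whereas the paper uses an expectation-maximization algorithm because durations are hidden inside the HMM. The class name `GammaPrior`, its fields `shape` and `rate`, and the example durations are hypothetical and chosen only for illustration. The sketch relies on the standard result that a Gamma(shape, rate) prior on a Poisson rate yields a Gamma(shape + sum of durations, rate + number of durations) posterior.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class GammaPrior:
    """Gamma(shape, rate) density over the rate of a Poisson duration model.

    Hypothetical helper for illustration; names are not taken from the paper.
    """
    shape: float
    rate: float

    def update(self, durations: Iterable[int]) -> "GammaPrior":
        """Return the gamma posterior after observing a block of durations.

        Assumes durations are directly observed (no hidden state alignment).
        """
        observed = list(durations)
        if any(d < 0 for d in observed):
            raise ValueError("durations must be non-negative counts")
        # Conjugacy: add the summed counts to the shape and the
        # number of observations to the rate.
        return GammaPrior(self.shape + sum(observed), self.rate + len(observed))

    def mode(self) -> float:
        """Posterior mode, usable as a point estimate of the Poisson rate."""
        if self.shape < 1.0:
            raise ValueError("mode is defined here only for shape >= 1")
        return (self.shape - 1.0) / self.rate

    def mean(self) -> float:
        """Posterior mean of the Poisson rate."""
        return self.shape / self.rate


# Sequential learning: each posterior becomes the prior for the next block.
prior = GammaPrior(shape=2.0, rate=1.0)
sequential = prior.update([3]).update([5]).update([4])

# Batch learning over the same data reaches the same posterior statistics.
batch = prior.update([3, 5, 4])
```

Because the posterior stays in the gamma family, only two numbers need to be carried from one block of adaptation data to the next, which is what makes the sequential update cheap in this simplified setting.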

Similar Papers

Bayesian learning of speech duration models
Jen-Tzung Chien ... Chih-Hsien Huang
IEEE Transactions on Speech and Audio Processing | VOL. 11
01 Nov 2003

Study of deep learning and CMU sphinx in automatic speech recognition
Abhishek Dhankar
01 Sep 2017

Discriminative Learning for Speech Recognition
Xiaodong He ... Li Deng
01 Jan 2008

Discriminatively estimated discrete, parametric and smoothed-discrete duration models for speech recognition
Maider Lehr ... Izhak Shafran
01 May 2011
