Two-stage speaker adaptation in subspace Gaussian mixture models

Sina Hamidi Ghalehjegh, Richard C Rose

doi:10.1109/icassp.2014.6854821

Abstract

A two-stage speaker adaptation approach is proposed for the subspace Gaussian mixture model (SGMM) [1] in large-vocabulary automatic speech recognition (ASR). The SGMM differs from the better-known continuous density hidden Markov model (CDHMM) in that a large portion of the SGMM parameters are dedicated to shared full-covariance Gaussian subspace parameters, while a relatively small number of parameters are used for state-dependent projection vectors. Both model-space and feature-space adaptation are investigated. First, an efficient regression-based approach for subspace vector adaptation (SVA) is presented. Second, an efficient approach is presented for feature-space adaptation using constrained maximum likelihood linear regression (CMLLR) in the SGMM. While both of these adaptation scenarios have previously been investigated in the context of the SGMM [2, 3], a more efficient and numerically stable procedure is presented here for estimating the parameters of the regression-based transformations. Both transformation matrices are obtained using an optimization technique that iteratively updates the rows of the regression matrices. It is shown that using these feature-space and model-space approaches for unsupervised speaker adaptation provides complementary improvements in SGMM-based ASR word accuracy.
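To make the model structure in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of how a state likelihood might be computed in an SGMM and where a CMLLR feature transform enters it. It assumes the commonly presented single-substate form, in which state means are M_i v_j, mixture weights are a softmax over w_i^T v_j, and the covariances Sigma_i are shared across states; a speaker transform (A, b) is applied to the feature vector and contributes log|det A| to the log-likelihood. All function and variable names are hypothetical, the parameters are random placeholders, and the row-by-row estimation of the transforms described in the abstract is not shown.

```python
import numpy as np

def sgmm_state_params(M, w, v):
    """Derive state-specific means and weights from shared subspace parameters.

    M: (I, D, S) mean-projection matrices (shared), w: (I, S) weight-projection
    vectors (shared), v: (S,) state-dependent vector. Names are illustrative.
    """
    means = M @ v                      # (I, D): mu_i = M_i v
    logits = w @ v                     # (I,)
    logits = logits - logits.max()     # stabilise the softmax
    weights = np.exp(logits)
    weights /= weights.sum()
    return means, weights

def sgmm_log_likelihood(x, M, w, Sigma, v, A=None, b=None):
    """Log-likelihood of one frame x for one state, optionally with a
    CMLLR-style feature transform x -> A x + b (assumed form)."""
    D = x.shape[0]
    log_det_A = 0.0
    if A is not None:
        x = A @ x + b
        _, log_det_A = np.linalg.slogdet(A)   # log |det A| Jacobian term
    means, weights = sgmm_state_params(M, w, v)
    log_terms = np.empty(M.shape[0])
    for i in range(M.shape[0]):
        diff = x - means[i]
        _, log_det = np.linalg.slogdet(Sigma[i])
        maha = diff @ np.linalg.solve(Sigma[i], diff)
        log_gauss = -0.5 * (D * np.log(2.0 * np.pi) + log_det + maha)
        log_terms[i] = np.log(weights[i]) + log_gauss
    m = log_terms.max()                       # log-sum-exp over Gaussians
    return log_det_A + m + np.log(np.exp(log_terms - m).sum())

# Placeholder dimensions and random parameters, for illustration only.
rng = np.random.default_rng(0)
I, D, S = 4, 3, 5
M = rng.normal(size=(I, D, S))
w = rng.normal(size=(I, S))
v = rng.normal(size=S)
R = rng.normal(size=(I, D, D))
Sigma = R @ R.transpose(0, 2, 1) + D * np.eye(D)   # symmetric positive definite
x = rng.normal(size=D)

means, weights = sgmm_state_params(M, w, v)
ll_plain = sgmm_log_likelihood(x, M, w, Sigma, v)
ll_adapted = sgmm_log_likelihood(x, M, w, Sigma, v, A=np.eye(D), b=np.zeros(D))
```

With the identity transform, the adapted and unadapted log-likelihoods coincide, which is a convenient sanity check before any transform is estimated.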

Similar Papers

Linear Regression Based Acoustic Adaptation for the Subspace Gaussian Mixture Model
Sina Hamidi Ghalehjegh ... Richard C Rose
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 22
01 Sep 2014

Acoustic modeling using transform-based phone-cluster adaptive training
Vimal Manohar ... C Bhargav Srinivas
01 Dec 2013

Regularized constrained maximum likelihood linear regression for speech recognition
Sina Hamidi Ghalehjegh ... Richard C Rose
01 May 2014

Investigation of different acoustic modeling techniques for low resource Indian language data
Sriranjani R ... Murali Karthick B
01 Feb 2015
