Abstract

The Karhunen–Loève transform (KLT) is a well-known technique for orthonormally mapping features into an uncorrelated space, and the Gaussian mixture model (GMM) with diagonal covariance matrices is a popular technique for modeling speech feature distributions. The two techniques can be combined to improve the performance of speaker or speech recognition systems, but the drawback of this combination is that the two sets of parameters are not optimized jointly. This paper presents a new model structure that integrates the orthonormal transformation and the diagonal-covariance Gaussian mixture into a unified framework, in which all parameters are estimated simultaneously by maximum likelihood. The idea is further extended to obtain a new GMM with generalized covariance matrices (GC–GMM); the traditional GMM with diagonal or full covariance matrices is a special case of the GC–GMM. The proposed method is demonstrated on a 100-person connected-digit database for text-independent speaker identification. Compared with the traditional GMM, the computational complexity and the number of parameters are greatly reduced with no degradation in system performance.
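The decorrelation step the abstract refers to can be sketched as follows. This is a minimal illustration of the KLT itself, not the paper's jointly optimized model: the transform projects features onto the orthonormal eigenvectors of their covariance matrix, so the transformed features are uncorrelated and better suited to a diagonal-covariance GMM. The data here is synthetic stand-in for speech features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic correlated 2-D features (stand-in for speech features).
A = np.array([[2.0, 1.0], [0.5, 1.5]])
x = rng.standard_normal((1000, 2)) @ A.T

# KLT: eigen-decomposition of the sample covariance matrix.
cov = np.cov(x, rowvar=False)
_, U = np.linalg.eigh(cov)       # columns of U are orthonormal eigenvectors
y = (x - x.mean(axis=0)) @ U     # project onto the eigenvector basis

# The transformed features are uncorrelated: off-diagonal covariance ~ 0,
# so a diagonal covariance matrix now captures the full second-order structure.
cov_y = np.cov(y, rowvar=False)
print(abs(cov_y[0, 1]))
```

In the conventional pipeline this transform is estimated first and the diagonal GMM is fit afterward; the paper's contribution is to estimate both sets of parameters together under a single maximum-likelihood criterion.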
