Speaker Identification Using a Novel Combination of Sparse Representation and Gaussian Mixture Models

Yun Jie Ma

doi:10.4028/www.scientific.net/amm.615.265

Abstract

In recent years, sparse representation has become a popular method for pattern recognition and can outperform traditional methods. This paper presents a novel combination of sparse representation and traditional Gaussian mixture models (GMMs). A dictionary for each speaker, termed a subspace in this paper, is learned using the K-SVD algorithm, and its entries are the union of the GMM mean matrices for that speaker. The test utterance is then projected onto each dictionary, and the decision is made according to the reconstruction errors. The experiments are conducted on a database collected in our anechoic chamber. The accuracy of the proposed approach varies with the sparsity and the dictionary size. With appropriate parameters, the accuracy reaches 98.5%.
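The decision step described in the abstract (project the test utterance onto each speaker's dictionary, then pick the speaker with the smallest reconstruction error) can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the paper's implementation: the function names `omp`, `reconstruction_error`, and `identify` are hypothetical, orthogonal matching pursuit is used as one common sparse coder, and the dictionaries are hand-built orthonormal placeholders instead of K-SVD dictionaries built from GMM means.

```python
import numpy as np


def omp(D, x, sparsity):
    """Greedy orthogonal matching pursuit (assumed sparse coder):
    approximate x using at most `sparsity` columns (atoms) of D."""
    residual = x.copy()
    support = []
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(sparsity):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break  # no new atom improves the fit
        support.append(idx)
        # Refit all selected atoms by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef


def reconstruction_error(D, frames, sparsity):
    """Sum of squared residuals over all feature frames of an utterance."""
    return sum(np.linalg.norm(x - D @ omp(D, x, sparsity)) ** 2 for x in frames)


def identify(dictionaries, frames, sparsity):
    """Return the speaker whose dictionary reconstructs the frames best."""
    errors = {spk: reconstruction_error(D, frames, sparsity)
              for spk, D in dictionaries.items()}
    return min(errors, key=errors.get), errors


# Toy data (placeholder, NOT learned dictionaries): two speakers whose
# dictionaries span orthogonal 4-dimensional subspaces of R^8.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
dictionaries = {"A": Q[:, :4], "B": Q[:, 4:]}

# Test frames: 2-sparse combinations of speaker B's atoms.
frames = []
for _ in range(20):
    atoms = rng.choice(4, size=2, replace=False)
    weights = rng.uniform(0.5, 1.5, size=2)
    frames.append(dictionaries["B"][:, atoms] @ weights)

speaker, errors = identify(dictionaries, frames, sparsity=2)
print(speaker, errors)
```

Because the toy frames lie entirely in speaker B's subspace, B's reconstruction error is numerically zero while A's is not, so the decision favors B. With real features, the sparsity and the dictionary size would have to be tuned, consistent with the abstract's observation that accuracy depends on both.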
