Abstract

In recent years, sparse representation has become a very popular method for pattern recognition which could outperform the traditional methods. This paper presents a novel combination of sparse representation and traditional Gaussian mixture models. Each person’s dictionary or termed as subspace in this paper are learned using K-SVD algorithm while the entries are GMM mean matrixes union for each speaker. Then project the test utterance into each dictionary and finally make decision depending on the reconstruction errors. The experiments are conducted on the database collected in our anechoic chamber. The proposed approach results in different accuracy for different sparsity and dictionary size. In appropriate parameters, the accuracy can reach 98.5% which is fairly good.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call