Speaker recognition using pyramid match kernel based support vector machines

A D Dileep, C Chandra Sekhar

doi:10.1007/s10772-012-9154-4

Abstract

Gaussian mixture model (GMM) based approaches are commonly used for speaker recognition. Methods for estimating GMM parameters include expectation-maximization, which is a non-discriminative learning method. Discriminative approaches to speaker recognition include support vector machine (SVM) classifiers that use dynamic kernels such as the generalized linear discriminant sequence kernel, the probabilistic sequence kernel, the GMM supervector kernel, the GMM-UBM mean interval (GUMI) kernel, and the intermediate matching kernel. Recently, the pyramid match kernel (PMK), which uses grids in the feature space as histogram bins, and the vocabulary-guided PMK (VGPMK), which uses clusters in the feature space as histogram bins, have been proposed for recognizing objects in an image represented as a set of local feature vectors. In PMK, a set of feature vectors is mapped onto a multi-resolution histogram pyramid. The kernel between a pair of examples is computed by comparing their pyramids using a weighted histogram intersection function at each level of the pyramid. We propose to use the PMK-based SVM classifier for speaker identification and verification, with the speech signal of an utterance represented as a set of local feature vectors. The main issue in building a PMK-based SVM classifier is the construction of the pyramid of histograms. We first propose the codebook-based PMK (CBPMK), which forms hard clusters using k-means clustering, with the number of clusters increasing across the levels of the pyramid. We then propose the GMM-based PMK (GMMPMK), which uses soft clustering. We compare the GMM-based approaches with the PMK-based and other dynamic kernel SVM-based approaches to speaker identification and verification, using the 2002 and 2003 NIST speaker recognition corpora for evaluation.

Our results show that the dynamic kernel SVM-based approaches perform significantly better than the state-of-the-art GMM-based approaches. For the speaker recognition task, the GMMPMK-based SVM performs better than SVMs using many other dynamic kernels and comparably to SVMs using the state-of-the-art GUMI kernel. The storage requirements of the GMMPMK-based SVMs are lower than those of SVMs using any other dynamic kernel.
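The codebook-based pyramid described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the branching factor of 2, the use of independent k-means runs per level, and the weight `branch ** -(top - j)` given to new matches at coarser levels are all assumptions chosen for illustration, following the general pyramid-match idea of crediting matches found at finer levels more heavily.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(v, centers):
    """Index of the center closest to v (hard cluster assignment)."""
    return min(range(len(centers)), key=lambda i: sq_dist(v, centers[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of vectors; returns k centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for i, g in enumerate(groups):
            if g:  # keep the previous center if a cluster is empty
                centers[i] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def build_codebooks(pool, levels, branch=2, seed=0):
    """One codebook per level; level j has branch**j clusters (assumed)."""
    return [kmeans(pool, branch ** j, seed=seed) for j in range(levels)]

def histogram(vectors, centers):
    """Count how many vectors fall into each cluster of one codebook."""
    h = [0] * len(centers)
    for v in vectors:
        h[nearest(v, centers)] += 1
    return h

def intersection(h1, h2):
    """Histogram intersection: number of matches at one level."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def cbpmk(x, y, codebooks, branch=2):
    """Weighted sum of new matches per level, finest level weighted most."""
    s = [intersection(histogram(x, c), histogram(y, c)) for c in codebooks]
    top = len(codebooks) - 1
    k = float(s[top])  # matches at the finest level count fully
    for j in range(top - 1, -1, -1):
        # Matches that first appear at the coarser level j, discounted.
        # With non-nested clusters this difference can be negative; the
        # total still rearranges to a positively weighted sum of the s[j].
        k += (s[j] - s[j + 1]) / branch ** (top - j)
    return k

def normalized_cbpmk(x, y, codebooks, branch=2):
    """Normalize so that sets of different sizes are comparable."""
    kxy = cbpmk(x, y, codebooks, branch)
    kxx = cbpmk(x, x, codebooks, branch)
    kyy = cbpmk(y, y, codebooks, branch)
    return kxy / math.sqrt(kxx * kyy)
```

Here `x` and `y` would be the sets of local feature vectors extracted from two utterances, and `pool` a collection of training vectors used to build the codebooks. A soft-clustering variant in the spirit of GMMPMK would presumably replace the integer counts in `histogram` with accumulated GMM component posteriors; that change is not shown here.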

Journal: International Journal of Speech Technology
Publication Date: Jun 12, 2012
Citations: 11

Similar Papers

Speaker Identification Using Intermediate Matching Kernel-Based Support Vector Machines
A. D. Dileep ... C. Chandra Sekhar
04 Oct 2011

Class-specific GMM based intermediate matching kernel for classification of varying length patterns of long duration speech using support vector machines
A.D Dileep ... C Chandra Sekhar
Speech Communication | VOL. 57
07 Oct 2013

GMM-based intermediate matching kernel for classification of varying length patterns of long duration speech using support vector machines
Aroor Dinesh Dileep ... Chellu Chandra Sekhar
IEEE Transactions on Neural Networks and Learning Systems | VOL. 25
01 Aug 2014

Kernel methods based approaches to image classification and retrieval
C Chandra Sekhar ... Raj Kumar Buyya
01 Dec 2012
