Abstract

Keywords: support vector machine, SVM, speaker identification, speaker verification, KL divergence, Kullback-Leibler divergence, probabilistic distance kernels, multimedia

A major weakness of SVMs has been their reliance on generic kernel functions, such as polynomial, linear, and Gaussian kernels, to compute distances among data points. These kernels do not take full advantage of the inherent probability distributions of the data. Focusing on audio speaker identification and verification, we propose novel kernel functions that exploit good probabilistic and descriptive models of audio data. We explore generative speaker identification models such as Gaussian Mixture Models (GMMs) and derive a kernel distance based on the Kullback-Leibler (KL) divergence between generative models. In effect, our approach combines the best of both generative and discriminative methods. Our results show that these new kernels perform as well as baseline GMM classifiers and outperform generic-kernel SVMs in both speaker identification and verification on two different audio databases.
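To make the idea of a KL-divergence-based kernel concrete, the following is a minimal sketch and not the method derived in the paper. It assumes each utterance is summarized by a single diagonal-covariance Gaussian rather than a full GMM, because the KL divergence then has a closed form. It also assumes the divergence is symmetrized and turned into a similarity by exponentiation. The function names and the `scale` parameter are hypothetical.

```python
import numpy as np


def fit_diag_gaussian(frames, var_floor=1e-6):
    """Fit one diagonal-covariance Gaussian to an (n_frames, n_dims) array.

    Assumption: a single Gaussian stands in for the generative model.
    """
    mean = frames.mean(axis=0)
    var = frames.var(axis=0) + var_floor  # floor avoids division by zero
    return mean, var


def kl_diag_gaussian(p, q):
    """Closed-form KL(p || q) between two diagonal-covariance Gaussians."""
    mean_p, var_p = p
    mean_q, var_q = q
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mean_p - mean_q) ** 2) / var_q - 1.0
    )


def symmetric_kl(p, q):
    """Symmetrized divergence: KL(p || q) + KL(q || p)."""
    return kl_diag_gaussian(p, q) + kl_diag_gaussian(q, p)


def kl_kernel(models, scale=1.0):
    """Gram matrix with entries exp(-scale * symmetric KL).

    Assumption: exponentiation is one possible way to map a divergence
    to a similarity; `scale` is a hypothetical tuning parameter.
    """
    n = len(models)
    gram = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            gram[i, j] = np.exp(-scale * symmetric_kl(models[i], models[j]))
    return gram
```

A Gram matrix built this way could be passed to an SVM implementation that accepts precomputed kernels. A matrix obtained by exponentiating a symmetric KL divergence is not guaranteed to be positive semidefinite in general, so that property would need to be checked on the data at hand.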
