Abstract

Speaker recognition using support vector machines (SVMs) with features derived from generative models has been shown to perform well. Typically, a universal background model (UBM) is adapted to each utterance yielding a set of features that are used in an SVM. We consider the case where the UBM is a Gaussian mixture model (GMM), and maximum likelihood linear regression (MLLR) adaptation is used to adapt the means of the UBM. Recent work has examined this setup for the case where a global MLLR transform is applied to all the mixture components of the QMM UBM. This work produced positive results that warrant examining this setup with multi-class MLLR adaptation, which groups the UBM mixture components into classes and applies a different transform to each class. This paper extends the MLLR/GMM framework to the multi- class case. Experiments on the NIST SRE 2006 corpus show that multi-class MLLR improves on global MLLR and that the proposed system's performance is comparable with state of the art systems.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call