Abstract

Our ongoing work applies Fishervoice to map joint factor analysis (JFA)-mean supervectors¹ into a compressed discriminant subspace, and has shown that cosine distance scoring on the Fishervoice-projected vectors outperforms classical JFA. In this paper, we refine Fishervoice for low-dimensional i-vectors by using only the nonparametric between-class scatter matrix in place of the parametric one in linear discriminant analysis (LDA). The 2016 Speaker Recognition Evaluation (SRE16) task provides only unlabeled in-domain data and labeled out-of-domain data for model training. Support vector machine (SVM) scoring can capture the discriminative information embedded in the unlabeled in-domain training data. Before SVM scoring, we apply probabilistic linear discriminant analysis (PLDA) for inter-session compensation, using the speaker labels from the out-of-domain training data. This approach constitutes CUHK’s submission to SRE16. We present a detailed analysis of these approaches and of the performance gains obtained with the refined Fishervoice and PLDA-SVM scoring.

¹ The JFA-mean supervector of an utterance is a GMM supervector obtained from the JFA model.
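
To make the idea of substituting a nonparametric between-class scatter matrix into LDA more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the common k-nearest-neighbor local-mean formulation of nonparametric discriminant analysis without any sample weighting, which the abstract does not specify. The function names, the neighbor count `k`, and the `ridge` regularizer are all hypothetical choices made for illustration. Cosine scoring is then applied to the projected vectors.

```python
import numpy as np

def nonparametric_between_scatter(X, y, k=3):
    """Between-class scatter from k-NN local means (assumed formulation, unweighted)."""
    d = X.shape[1]
    Sb = np.zeros((d, d))
    classes = np.unique(y)
    for c in classes:
        Xc = X[y == c]
        for other in classes:
            if other == c:
                continue
            Xo = X[y == other]
            kk = min(k, len(Xo))
            # Squared distances from every sample of class c to every sample of `other`.
            d2 = ((Xc[:, None, :] - Xo[None, :, :]) ** 2).sum(axis=2)
            nn = np.argsort(d2, axis=1)[:, :kk]
            # Local mean of the kk nearest neighbors in the other class, per sample.
            local_means = Xo[nn].mean(axis=1)
            diff = Xc - local_means
            Sb += diff.T @ diff
    return Sb / len(X)

def within_scatter(X, y):
    """Standard parametric within-class scatter."""
    d = X.shape[1]
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        diff = Xc - Xc.mean(axis=0)
        Sw += diff.T @ diff
    return Sw / len(X)

def fit_projection(X, y, n_components, k=3, ridge=1e-6):
    """Solve Sb w = lambda Sw w by whitening with the Cholesky factor of Sw."""
    Sw = within_scatter(X, y) + ridge * np.eye(X.shape[1])  # ridge keeps Sw invertible
    Sb = nonparametric_between_scatter(X, y, k)
    L = np.linalg.cholesky(Sw)
    Linv = np.linalg.inv(L)
    evals, evecs = np.linalg.eigh(Linv @ Sb @ Linv.T)
    order = np.argsort(evals)[::-1][:n_components]
    return Linv.T @ evecs[:, order]  # shape (d, n_components)

def cosine_score(a, b):
    """Cosine similarity between two projected vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Under these assumptions, an enrollment vector `e` and a test vector `t` would be scored as `cosine_score(W.T @ e, W.T @ t)`, where `W` is returned by `fit_projection` on labeled training vectors. Because only the between-class matrix is replaced, the within-class scatter stays in its usual parametric form.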
