Study on pairwise LDA for x‐vector‐based speaker recognition

A Kanagasundaram, S Sridharan, C Fookes, S Ganapathy

doi:10.1049/el.2019.0880

Abstract

In typical x-vector-based speaker recognition systems, standard linear discriminant analysis (LDA) is used to transform the x-vector space with the aim of maximising the between-speaker discriminant information while minimising the within-speaker variability. For LDA, it is customary to use all the available speakers in the speaker recognition development dataset. In this study, the authors investigate whether, in LDA-based channel compensation, it is more beneficial to estimate the between-speaker discriminant information from the most confusing samples and the within-speaker variability from the most distant samples (relative to the target speaker mean). The between-speaker variance is estimated using a pairwise approach in which the most confusing non-target speaker samples are found based on the Euclidean distance between the speaker mean and the adjacent speakers' samples. The within-speaker variance is estimated using the mean of each speaker and the furthermost samples in that speaker's sessions. Experimental results demonstrate that the proposed LDA approach for an x-vector-based speaker recognition system achieves over 17% relative improvement in equal error rate over standard LDA-based x-vector speaker recognition systems on the NIST 2010 coreext-coreext condition.
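
The selection scheme described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact scatter-matrix formulas, so the function name `pairwise_lda`, the parameters `n_confusing`, `n_distant`, `n_components` and `ridge`, and the choice to accumulate outer products of deviations from the speaker mean are all assumptions made for illustration.

```python
import numpy as np


def pairwise_lda(X, labels, n_confusing=5, n_distant=5, n_components=2, ridge=1e-6):
    """Hypothetical sketch of an LDA variant with sample selection.

    X      : (n_samples, dim) array of x-vectors
    labels : (n_samples,) speaker identifiers
    Returns a (dim, n_components) projection matrix.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    dim = X.shape[1]
    S_b = np.zeros((dim, dim))  # between-speaker scatter (assumed form)
    S_w = np.zeros((dim, dim))  # within-speaker scatter (assumed form)

    for spk in np.unique(labels):
        target = X[labels == spk]
        others = X[labels != spk]
        mu = target.mean(axis=0)

        # "Most confusing" samples: non-target samples closest to the
        # target speaker mean in Euclidean distance.
        d_other = np.linalg.norm(others - mu, axis=1)
        near = others[np.argsort(d_other)[:n_confusing]] - mu
        S_b += near.T @ near

        # "Most distant" samples: the target speaker's own samples that lie
        # furthest from the speaker mean.
        d_own = np.linalg.norm(target - mu, axis=1)
        far = target[np.argsort(d_own)[-n_distant:]] - mu
        S_w += far.T @ far

    # Small ridge term (an assumption) so that S_w is positive definite.
    S_w += ridge * (np.trace(S_w) / dim) * np.eye(dim)

    # Solve S_b v = lambda S_w v by whitening with the Cholesky factor of S_w.
    L = np.linalg.cholesky(S_w)
    L_inv = np.linalg.inv(L)
    eigvals, eigvecs = np.linalg.eigh(L_inv @ S_b @ L_inv.T)
    order = np.argsort(eigvals)[::-1][:n_components]
    return L_inv.T @ eigvecs[:, order]
```

Under these assumptions, the x-vectors would be projected as `X @ W`, where `W` is the returned matrix. Standard LDA corresponds to using every sample of every speaker instead of the selected subsets.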

Full Text