Abstract

The standard i-vector/Gaussian probabilistic linear discriminant analysis (G-PLDA) system does not compensate for duration mismatch, which is a significant confounding factor in short duration speaker verification. A novel duration compensation technique is proposed to normalise the distribution mismatch that duration variation causes in the i-vector space. The technique uses two factor analysers, tied together so that they share latent variables for a given speaker, as the underlying generative model of the i-vector space. This leads to a transform which maps the original i-vectors onto a latent subspace that is expected to be duration invariant. The proposed method has the advantage that it normalises distribution mismatch while taking into consideration both inter- and intra-speaker variability. Experiments conducted on the NIST SRE 2010 database show that the proposed method leads to relative improvements of 18.54, 15.48 and 8.77% when tested on utterances of 10, 5 and 3 s duration, respectively, compared with the best results obtained by either the standard i-vector/G-PLDA or the previously proposed twin model G-PLDA.
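
The abstract does not give the model equations, so the following is only a sketch of how a pair of tied factor analysers is commonly written, not necessarily the exact formulation used in the paper. All notation is assumed here for illustration: the duration condition index d (long or short), the i-vector x_d, the mean, loading matrix and noise covariance of each analyser, and the shared speaker latent variable h.

```latex
% Illustrative sketch only: all symbols are assumed notation, not taken from the paper.
% Two factor analysers, one per duration condition d, tied through the same latent h.
\begin{align}
x_{d} &= \mu_{d} + W_{d}\,h + \epsilon_{d}, \qquad d \in \{\mathrm{long}, \mathrm{short}\}, \\
h &\sim \mathcal{N}(0, I), \qquad \epsilon_{d} \sim \mathcal{N}(0, \Sigma_{d}), \\
% Standard factor-analysis posterior mean of h given a single observation x_d.
\mathbb{E}[h \mid x_{d}] &= \left(I + W_{d}^{\top}\Sigma_{d}^{-1}W_{d}\right)^{-1} W_{d}^{\top}\Sigma_{d}^{-1}\,(x_{d} - \mu_{d}).
\end{align}
```

Under these assumptions, the posterior mean in the last line would play the role of the transform onto the shared latent subspace: i-vectors from either duration condition are mapped to an estimate of the same speaker variable h. Whether the paper uses exactly this projection is not stated in the abstract.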
