Abstract

State-of-the-art speaker recognition systems degrade rapidly when dealing with short utterances. It is well known that identity vectors (i-vectors) extracted from short utterances have large uncertainties, and that the standard probabilistic linear discriminant analysis (PLDA) method cannot exploit this uncertainty to reduce the effect of duration variation. In this work, we use a shared mixture of PLDA (SM-PLDA) to remodel the i-vectors by utilizing their uncertainties. SM-PLDA is an improved generative model with a shared intrinsic factor, which can be regarded as an identity vector containing speaker identification information. This identity vector can in turn be modeled by PLDA. Performance is evaluated using both the equal error rate and the minimum detection cost function. Experiments conducted on the National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE) 2010 extended tasks show that the proposed method achieves significant improvements over i-vector/PLDA and some other advanced methods.
