Abstract

State-of-the-art speaker recognition systems degrade rapidly when dealing with short utterances. Identity vectors (i-vectors) extracted from short utterances are known to carry large uncertainties, and the standard probabilistic linear discriminant analysis (PLDA) method cannot exploit this uncertainty to reduce the effect of duration variation. In this work, we use a shared mixture of PLDA (SM-PLDA) to remodel the i-vectors using their uncertainties. SM-PLDA is an improved generative model with a shared intrinsic factor, which can be regarded as an identity vector containing speaker identification information and can itself be modeled by PLDA. Results are evaluated with both equal error rate (EER) and minimum detection cost function (minDCF). Experiments on the National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE) 2010 extended tasks show that the proposed method achieves significant improvements over i-vector/PLDA and some other advanced methods.
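
For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how equal error rate and minimum detection cost function are commonly computed from verification scores. It is not the authors' evaluation code. The function names and the default cost parameters (`p_target`, `c_miss`, `c_fa`) are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def error_rates(target_scores, nontarget_scores):
    """Miss and false-alarm rates at every candidate decision threshold.

    A trial is accepted when its score is >= the threshold.
    """
    tar = np.asarray(target_scores, dtype=float)
    non = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.unique(np.concatenate([tar, non]))
    # Add one threshold above all scores so "reject everything" is covered.
    thresholds = np.append(thresholds, thresholds[-1] + 1.0)
    p_miss = np.array([(tar < t).mean() for t in thresholds])
    p_fa = np.array([(non >= t).mean() for t in thresholds])
    return thresholds, p_miss, p_fa


def equal_error_rate(target_scores, nontarget_scores):
    """Error rate at the threshold where miss and false-alarm rates are closest."""
    _, p_miss, p_fa = error_rates(target_scores, nontarget_scores)
    i = np.argmin(np.abs(p_miss - p_fa))
    return 0.5 * (p_miss[i] + p_fa[i])


def min_dcf(target_scores, nontarget_scores,
            p_target=0.001, c_miss=1.0, c_fa=1.0):
    """Minimum normalized detection cost over all thresholds.

    The default p_target, c_miss and c_fa are assumed values for
    illustration only; use the operating point of the evaluation plan
    you are following.
    """
    _, p_miss, p_fa = error_rates(target_scores, nontarget_scores)
    dcf = c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
    # Normalize by the cost of the best trivial system
    # (always accept or always reject).
    norm = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return dcf.min() / norm
```

Both functions take two arrays of scores, one for target (same-speaker) trials and one for non-target trials. Lower values are better for both metrics, and a system that separates the two score sets perfectly scores zero on each.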
