Abstract

Most state-of-the-art speaker verification systems are currently based on i-vectors and PLDA; however, PLDA requires a large amount of development data from many different speakers, which makes it difficult to learn PLDA parameters for a domain with scarce data. In this paper, we study and extend an effective transfer learning method based on Bayesian joint probability, in which the Kullback–Leibler (KL) divergence between the source domain and the target domain is added as a regularization term. This method uses development data from the source domain to help find the optimal PLDA parameters for the target domain. In particular, speaker verification on short utterances can be viewed as a task in a target domain where long utterances are scarce, so transfer learning for PLDA can also be adopted to learn discriminative information from source domains with plenty of long utterances. Experimental results on the NIST SRE and Switchboard corpora demonstrate that the proposed method offers a significant performance gain over traditional PLDA.
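As a hedged sketch of the idea (the notation below is illustrative, not quoted from the paper), the regularized objective can be written with $\theta_s$ and $\theta_t$ denoting the source- and target-domain PLDA parameters, $X_t$ the target-domain development data, and $\lambda$ an assumed trade-off weight:

\theta_t^{*} = \arg\max_{\theta_t} \; \log p(X_t \mid \theta_t) \; - \; \lambda \, D_{\mathrm{KL}}\!\left( p(x \mid \theta_s) \,\Vert\, p(x \mid \theta_t) \right)

Here the likelihood term fits the scarce target-domain data, while the KL term keeps the target-domain PLDA model close to the model learned from the abundant source-domain data.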
