Abstract
We propose a simple and effective strategy for coping with dataset shifts in text-dependent speaker recognition based on Joint Factor Analysis (JFA). We have previously shown how to compensate for lexical variation in text-dependent JFA by adapting the Universal Background Model (UBM) to individual passphrases. A similar type of adaptation can be used to port a JFA model trained on out-of-domain data to a given text-dependent task domain. On the RSR2015 test set we found that this type of adaptation gave essentially the same results as in-domain JFA training. To explore this idea more fully, we experimented with several types of JFA model on the CSLU speaker recognition dataset. Taking a suitably configured JFA model trained on NIST data and adapting it in the proposed way results in a 22% reduction in error rates compared with the GMM/UBM benchmark. Error rates are still much higher than those that can be achieved on the RSR2015 test set with the same strategy, but cheating experiments suggest that, if large amounts of in-domain training data are available, JFA modelling is capable in principle of achieving very low error rates even on hard tasks such as CSLU.
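To make the adaptation step concrete, the sketch below shows relevance-MAP adaptation of UBM component means to in-domain data, the standard Reynolds-style technique underlying the kind of UBM adaptation the abstract describes. It is an illustrative sketch only, not the authors' implementation; the component count, relevance factor, feature dimensionality, and use of scikit-learn are all assumed for the example.

import numpy as np
from sklearn.mixture import GaussianMixture

def train_ubm(features, n_components=64, seed=0):
    """Train a diagonal-covariance UBM on out-of-domain features."""
    ubm = GaussianMixture(n_components=n_components,
                          covariance_type="diag",
                          random_state=seed)
    ubm.fit(features)
    return ubm

def map_adapt_means(ubm, features, relevance=16.0):
    """Return relevance-MAP adapted component means.

    For each component c:
        n_c     = sum_t gamma_t(c)             (soft occupation count)
        E_c[x]  = sum_t gamma_t(c) x_t / n_c   (per-component data mean)
        alpha_c = n_c / (n_c + relevance)
        new m_c = alpha_c * E_c[x] + (1 - alpha_c) * old m_c
    """
    gamma = ubm.predict_proba(features)        # (T, C) responsibilities
    n_c = gamma.sum(axis=0)                    # (C,) occupation counts
    # Weighted per-component data means; guard against empty components.
    ex = gamma.T @ features / np.maximum(n_c, 1e-10)[:, None]
    alpha = (n_c / (n_c + relevance))[:, None] # (C, 1) adaptation weights
    return alpha * ex + (1.0 - alpha) * ubm.means_

# Hypothetical usage: port an out-of-domain UBM to a passphrase or
# task domain using a small amount of in-domain data.
rng = np.random.default_rng(0)
ood_feats = rng.normal(size=(5000, 20))             # stand-in out-of-domain features
indomain_feats = rng.normal(0.5, 1.0, (800, 20))    # stand-in in-domain features
ubm = train_ubm(ood_feats)
ubm.means_ = map_adapt_means(ubm, indomain_feats)

Components with little in-domain occupation stay close to the out-of-domain UBM (alpha near 0), while heavily observed components move toward the in-domain statistics, which is what makes this form of adaptation robust when in-domain data is scarce.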