Abstract

Recent works have reported the successful use of sparse representation (SR) over a learned dictionary for the speaker verification (SV) task. On practical data with large variability, however, SR based approaches are noted to produce inconsistent sparse coding: for true-target trials, the dominant coefficients in the sparse codes of the enrollment and test data happen to involve different atoms of the dictionary, which in turn increases the false rejection rate. In this work, we propose a novel yet simple way to address this problem. The key idea is to exploit the sparse coding of the enrollment data when finding the representation of the test data. Since the proposed constraint adversely affects the false alarm rate, multi-offset decimation diversity is introduced to address it. The combined approach has lower computational complexity, yet it is shown to outperform an existing factor analysis based SV approach when evaluated on the large-variability NIST 2012 speaker recognition evaluation dataset.
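The abstract does not give the coding algorithm, so the following is only a minimal illustrative sketch of the central idea: restrict the sparse coding of a test utterance to the dictionary atoms selected when coding the enrollment utterance, so that both codes share the same support. The OMP-style coder, the unit-norm random dictionary `D`, the supervector-like inputs, the sparsity level, and the cosine-similarity score are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def omp(D, y, sparsity):
    """Greedy orthogonal matching pursuit: returns (support, sparse coefficients)."""
    residual = y.copy()
    support = []
    coeffs = np.zeros(D.shape[1])
    for _ in range(sparsity):
        # pick the atom most correlated with the current residual
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # re-fit the coefficients of the selected atoms by least squares
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        coeffs[:] = 0.0
        coeffs[support] = sol
        residual = y - D[:, support] @ sol
    return support, coeffs

def constrained_code(D, y_test, enrol_support):
    """Code the test vector using only the enrollment support (the proposed constraint, sketched)."""
    sol, *_ = np.linalg.lstsq(D[:, enrol_support], y_test, rcond=None)
    coeffs = np.zeros(D.shape[1])
    coeffs[enrol_support] = sol
    return coeffs

def verification_score(alpha_enrol, alpha_test):
    """One possible trial score: cosine similarity between the two sparse codes."""
    return float(alpha_enrol @ alpha_test /
                 (np.linalg.norm(alpha_enrol) * np.linalg.norm(alpha_test) + 1e-12))

# Toy example with random data; dimensions and noise level are illustrative only.
rng = np.random.default_rng(0)
D = rng.standard_normal((400, 1000))
D /= np.linalg.norm(D, axis=0)                      # unit-norm dictionary atoms
y_enrol = rng.standard_normal(400)                  # stand-in for an enrollment supervector
y_test = y_enrol + 0.1 * rng.standard_normal(400)   # "same speaker" perturbation

support, a_enrol = omp(D, y_enrol, sparsity=20)
a_test = constrained_code(D, y_test, support)
print("verification score:", verification_score(a_enrol, a_test))
```

Because the test code is forced onto the enrollment support, a true-target trial cannot be rejected merely because the two codes happened to select different atoms; the trade-off, as the abstract notes, is a higher false alarm rate, which the paper counters with multi-offset decimation diversity (not sketched here).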
