Abstract

Recent works have reported the successful use of sparse representation (SR) over a learned dictionary for the speaker verification (SV) task. On practical data with large variability, however, SR based approaches are noted to produce inconsistent sparse coding: for true-target trials, the dominant coefficients in the sparse codes of the enrollment and test data happen to involve different atoms of the dictionary, which in turn increases the false rejection rate. In this work, we propose a novel yet simple way to address this problem. The key idea is to exploit the sparse coding of the enrollment data when finding the representation of the test data. Since the proposed constraint adversely affects the false alarm rate, multi-offset decimation diversity is introduced to address it. The combined approach has lower computational complexity, yet it is shown to outperform an existing factor analysis based SV approach when evaluated on the large-variability NIST 2012 speaker recognition evaluation dataset.
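The abstract does not give the coding algorithm, so the following is only a minimal illustrative sketch of the central idea: restrict the sparse coding of a test utterance to the dictionary atoms selected when coding the enrollment utterance, so that both codes share the same support. The OMP-style coder, the unit-norm random dictionary `D`, the supervector-like inputs, the sparsity level, and the cosine-similarity score are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def omp(D, y, sparsity):
    """Greedy orthogonal matching pursuit: returns (support, sparse coefficients)."""
    residual = y.copy()
    support = []
    coeffs = np.zeros(D.shape[1])
    for _ in range(sparsity):
        # pick the atom most correlated with the current residual
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # re-fit the coefficients of the selected atoms by least squares
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        coeffs[:] = 0.0
        coeffs[support] = sol
        residual = y - D[:, support] @ sol
    return support, coeffs

def constrained_code(D, y_test, enrol_support):
    """Code the test vector using only the enrollment support (the proposed constraint, sketched)."""
    sol, *_ = np.linalg.lstsq(D[:, enrol_support], y_test, rcond=None)
    coeffs = np.zeros(D.shape[1])
    coeffs[enrol_support] = sol
    return coeffs

def verification_score(alpha_enrol, alpha_test):
    """One possible trial score: cosine similarity between the two sparse codes."""
    return float(alpha_enrol @ alpha_test /
                 (np.linalg.norm(alpha_enrol) * np.linalg.norm(alpha_test) + 1e-12))

# Toy example with random data; dimensions and noise level are illustrative only.
rng = np.random.default_rng(0)
D = rng.standard_normal((400, 1000))
D /= np.linalg.norm(D, axis=0)                      # unit-norm dictionary atoms
y_enrol = rng.standard_normal(400)                  # stand-in for an enrollment supervector
y_test = y_enrol + 0.1 * rng.standard_normal(400)   # "same speaker" perturbation

support, a_enrol = omp(D, y_enrol, sparsity=20)
a_test = constrained_code(D, y_test, support)
print("verification score:", verification_score(a_enrol, a_test))
```

Because the test code is forced onto the enrollment support, a true-target trial cannot be rejected merely because the two codes happened to select different atoms; the trade-off, as the abstract notes, is a higher false alarm rate, which the paper counters with multi-offset decimation diversity (not sketched here).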
