Abstract

One of the most widely used approaches to training self-supervised speaker verification systems is to optimize the speaker embedding network discriminatively using pseudo-labels generated by a clustering algorithm. Although this pseudo-label-based self-supervised training scheme has shown impressive performance, recent studies have demonstrated that label noise can significantly degrade performance. In this paper, we explore pseudo-labels produced by different clustering algorithms and conduct a fine-grained analysis of the relationship between pseudo-label quality and speaker verification performance. Experimentally, we shed light on several previously overlooked aspects of the pseudo-labels that can impact speaker verification performance. Moreover, we observe that self-supervised speaker verification performance depends heavily on multiple qualitative aspects of the clustering algorithm used to generate the pseudo-labels. Furthermore, we show that speaker verification performance can be severely degraded by overfitting to the noisy pseudo-labels, and that a mixup strategy can mitigate the memorization effects of label noise.
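
The mixup strategy referenced above interpolates pairs of training examples and their (pseudo-)label targets so the network is discouraged from memorizing individual noisy labels. The following is a minimal sketch of how such a step could look for a speaker embedding classifier; the function name, tensor shapes, and the Beta-distribution parameter `alpha` are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F


def mixup_batch(features, pseudo_labels, num_classes, alpha=1.0):
    """Mix pairs of utterance features and their clustering-derived pseudo-labels.

    features:      (batch, ...) input tensor, e.g. acoustic features per utterance.
    pseudo_labels: (batch,) integer cluster assignments used as pseudo-labels.
    Returns mixed features and soft label targets for a cross-entropy-style loss.
    """
    # Sample the mixing coefficient from a Beta distribution.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()

    # Pair each example with a randomly permuted partner from the same batch.
    perm = torch.randperm(features.size(0))

    mixed_features = lam * features + (1.0 - lam) * features[perm]

    # Convert hard pseudo-labels to one-hot vectors and mix them the same way,
    # yielding soft targets that dilute the influence of any single noisy label.
    one_hot = F.one_hot(pseudo_labels, num_classes).float()
    mixed_targets = lam * one_hot + (1.0 - lam) * one_hot[perm]

    return mixed_features, mixed_targets
```

In use, the mixed features would be fed through the speaker embedding network and the loss computed against the soft `mixed_targets` (e.g. a soft-target cross-entropy), rather than against the raw cluster assignments.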
