Abstract
The mismatch between training and testing environments greatly degrades speaker recognition performance. Although many robust techniques have been proposed, this mismatch remains a challenge for speaker recognition systems. To address it, we propose an optimized dictionary-based sparse representation for robust speaker recognition. We first train a speech dictionary and a noise dictionary and concatenate them for sparse representation; we then design an optimization algorithm to reduce the mutual coherence between the two learned dictionaries; next, we use mixture k-means to model each speaker from the resulting sparse features; and finally, we present a distance divergence to measure similarity. Our preliminary experiments show that, compared with speaker recognition based on Mel-frequency cepstral coefficients, the proposed framework consistently improves robustness under mismatched conditions.
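To make the coherence-reduction step concrete, the sketch below computes one common notion of mutual coherence between two dictionaries: the largest absolute cosine similarity between any speech atom and any noise atom. This is an illustrative assumption, not the paper's algorithm; the abstract does not give the exact definition or the optimization procedure, and the names `normalize`, `cross_coherence`, and `concatenate` and the toy atoms are hypothetical.

```python
import math


def normalize(atom):
    """Scale an atom (a list of floats) to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in atom))
    if norm == 0.0:
        raise ValueError("a zero atom cannot be normalized")
    return [x / norm for x in atom]


def cross_coherence(speech_atoms, noise_atoms):
    """Largest |cosine similarity| between a speech atom and a noise atom.

    Assumed definition for illustration; a value near 0 means the two
    dictionaries span nearly orthogonal directions, a value near 1 means
    some speech atom is almost indistinguishable from some noise atom.
    """
    speech_unit = [normalize(a) for a in speech_atoms]
    noise_unit = [normalize(a) for a in noise_atoms]
    return max(
        abs(sum(x * y for x, y in zip(s, n)))
        for s in speech_unit
        for n in noise_unit
    )


def concatenate(speech_atoms, noise_atoms):
    """Stack unit-norm speech and noise atoms into one joint dictionary."""
    return [normalize(a) for a in speech_atoms] + [normalize(a) for a in noise_atoms]


# Toy 3-dimensional atoms (hypothetical data, not from the paper).
speech = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
noise_orthogonal = [[0.0, 0.0, 1.0]]   # shares no direction with speech
noise_overlapping = [[1.0, 1.0, 0.0]]  # lies in the span of the speech atoms

low = cross_coherence(speech, noise_orthogonal)
high = cross_coherence(speech, noise_overlapping)
joint = concatenate(speech, noise_overlapping)
```

In this toy case the orthogonal noise atom gives a coherence of 0, while the overlapping one gives about 0.707. A lower value means a noisy frame's energy is less likely to be split ambiguously between speech and noise atoms in the concatenated dictionary, which is the motivation for reducing coherence before sparse coding.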