Abstract

Mismatch between training and test conditions severely degrades speaker verification performance. To address this problem, this paper presents a method that combines improved nonnegative matrix factorization (IMNMF) with feature-warped mel-frequency cepstral coefficients (MFCC) to improve identity-vector (i-vector) speaker verification in noisy environments. Unlike traditional nonnegative matrix factorization (NMF), IMNMF adds extra free basis vectors to capture features that are absent from the training data and imposes linear constraints on the dictionary atoms. Feature warping is used to remove channel noise. The proposed method can therefore reduce distortion in the reconstructed speech while improving the quality of the recovered speech. The i-vector speaker verification system is evaluated on the short utterance database and the NOISEX-92 database. The experimental results indicate that score-level fusion of feature-warped IMNMF-MFCC and feature-warped MFCC outperforms the baseline system in terms of equal error rate (EER) under the majority of signal-to-noise ratios (SNRs).
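To make the feature warping step concrete, the following is a minimal sketch of the commonly used formulation, in which each cepstral coefficient is ranked within a sliding window and the rank is mapped to the matching quantile of a standard normal distribution. The abstract does not specify the paper's exact configuration, so the function name feature_warp, the 301-frame window (roughly 3 s at an assumed 10 ms frame shift), and the tie handling are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np
from statistics import NormalDist


def feature_warp(features, window=301):
    """Warp each feature dimension toward a standard normal distribution.

    features: array of shape (num_frames, num_coeffs), e.g. MFCC vectors.
    window:   sliding-window length in frames (assumed value, not from the paper).
    """
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    half = window // 2
    inv_cdf = NormalDist().inv_cdf  # inverse CDF of N(0, 1)
    warped = np.empty_like(features)

    for t in range(num_frames):
        # Window centred on frame t, truncated at the utterance edges.
        lo = max(0, t - half)
        hi = min(num_frames, t + half + 1)
        block = features[lo:hi]
        n = block.shape[0]

        # Ascending rank of frame t within the window, per coefficient
        # (1 = smallest value; ties share the lowest rank).
        rank = 1 + np.sum(block < features[t], axis=0)

        # Map the rank to the midpoint of its probability bin, then to
        # the corresponding standard normal quantile.
        probs = (rank - 0.5) / n
        warped[t] = [inv_cdf(p) for p in probs]

    return warped
```

For example, feature_warp(mfcc), where mfcc is a hypothetical (num_frames, num_coeffs) array, returns an array of the same shape. Because only the rank within the window is used, the warped values are insensitive to any monotonic distortion of a coefficient over that window, which is the property that makes the technique useful against channel mismatch.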

