Abstract

The i-vector representation and probabilistic linear discriminant analysis (PLDA) have shown state-of-the-art performance in many speaker verification systems. In real-world environments, however, additive and convolutive noise cause mismatches between training and recognition conditions, which degrades performance. This paper proposes a fusion system that combines a multi-condition PLDA model with a mixture of SNR-dependent PLDA models to make the verification system robust to noise. The signal-to-noise ratio (SNR) of each test utterance is used to determine the best SNR-dependent PLDA model for scoring against the target speaker's i-vectors. The fusion system is evaluated on NIST 2012 SRE. Results show that the SNR-dependent PLDA models can reduce the equal error rate (EER) and that the fusion system is more robust than conventional i-vector/PLDA systems under noisy conditions. The SNR-dependent PLDA models are also found to be insensitive to Z-norm parameters.
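The selection-and-fusion idea can be illustrated with a minimal sketch. The abstract does not state how the best SNR-dependent model is chosen or how the scores are fused, so the nearest-SNR selection rule, the fixed-weight linear fusion, and all names below (`SnrModel`, `select_model`, `fused_score`, `dot`) are assumptions made for illustration only. The scorers are stand-ins for real PLDA likelihood-ratio scoring, and Z-norm is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# A scorer maps (target i-vector, test i-vector) to a verification score.
Scorer = Callable[[Sequence[float], Sequence[float]], float]


@dataclass
class SnrModel:
    center_db: float  # hypothetical nominal SNR (dB) of this model's training data
    score: Scorer     # stand-in for the PLDA scoring function of this model


def select_model(models: Sequence[SnrModel], test_snr_db: float) -> SnrModel:
    """Pick the model whose nominal SNR is closest to the test utterance's SNR.

    Assumption: nearest-SNR selection; the paper may use a different rule.
    """
    return min(models, key=lambda m: abs(m.center_db - test_snr_db))


def fused_score(
    target_ivec: Sequence[float],
    test_ivec: Sequence[float],
    test_snr_db: float,
    multi_condition: Scorer,
    snr_models: Sequence[SnrModel],
    weight: float = 0.5,
) -> float:
    """Linearly fuse the multi-condition score with the selected SNR-dependent score.

    Assumption: fixed-weight linear fusion; the weight here is not from the paper.
    """
    chosen = select_model(snr_models, test_snr_db)
    s_multi = multi_condition(target_ivec, test_ivec)
    s_snr = chosen.score(target_ivec, test_ivec)
    return weight * s_multi + (1.0 - weight) * s_snr


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Placeholder scorer (inner product); NOT a PLDA likelihood ratio."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    # Two hypothetical SNR-dependent models, nominally 6 dB and 15 dB.
    models = [
        SnrModel(center_db=6.0, score=lambda a, b: 1.0 * dot(a, b)),
        SnrModel(center_db=15.0, score=lambda a, b: 2.0 * dot(a, b)),
    ]
    target, test = [1.0, 2.0], [3.0, 4.0]
    print(select_model(models, 8.0).center_db)
    print(fused_score(target, test, 14.0, dot, models))
```

Passing the scorers in as callables keeps the sketch independent of any particular PLDA implementation; a real system would replace `dot` with the log-likelihood-ratio scoring of trained PLDA models and estimate `test_snr_db` from the test audio.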
