Abstract

The performance of a speaker recognition system declines when the audio codecs used in training and testing are mismatched. Based on an analysis of the effect of mismatched audio codecs on linear prediction cepstrum coefficients (LPCC), this paper proposes a MAP-based (maximum a posteriori) audio coding compensation method for speaker recognition. The method first sets a standard codec as a reference and trains the speaker models in that codec format. It then learns the distributions of the deviation between the standard codec format and each of the other formats, estimates the current bias from a small amount of adaptation data using MAP-based adaptation, and adjusts the model parameters according to the codec format of the incoming audio and its associated bias. During testing, the features of the incoming speaker are matched against the adjusted model. Experimental results show that accuracy reaches 82.4% with just one second of adaptation data, 5.5% higher than the baseline system.

Highlights

  • Speaker recognition is a technology that extracts speaker information from speech signals to determine the speaker's identity

  • Experimental results show that accuracy reaches 82.4% with just one second of adaptation data, 5.5% higher than the baseline system

  • We study speaker recognition under streaming media codecs and select four codecs that are popular on the Internet, with either known or unknown coding algorithms: mp3 (192 kbps, known coding algorithm), rm (64 kbps, unknown coding algorithm), wma (128 kbps, unknown coding algorithm) and ogg (128 kbps, known coding algorithm)


Summary

Introduction

Speaker recognition is a technology that extracts speaker information from speech signals to determine the speaker's identity. Many techniques have been developed to compensate for the degradation caused by a mismatch between training and testing audio codecs. They fall roughly into two categories: 1) feature compensation, in which the feature extraction process is modified, and 2) model adaptation, in which the parameters of the recognition models are adjusted. We propose a MAP-based audio coding compensation method for speaker recognition, which is a model adaptation method. The proposed method first sets a standard codec as a reference and trains the speaker models in that codec format. It then learns the distributions of the deviation between the standard codec format and the other formats, estimates the current bias from a small amount of adaptation data using the MAP-based adaptation technique, and adjusts the model parameters according to the coding format of the incoming audio and the associated bias. The features of the incoming speaker are then matched against the adjusted model, which effectively addresses the codec mismatch problem.
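
The bias estimation and model adjustment steps described above can be illustrated with a minimal sketch. It is not the paper's formulation, which this summary does not give. It assumes a single per-dimension bias shared by all model components, a Gaussian prior on that bias (standing in for the learned deviation distribution), and Gaussian observation noise. The names `map_bias`, `adjust_means`, `prior_mean`, `prior_var` and `noise_var` are hypothetical and chosen only for illustration.

```python
import numpy as np


def map_bias(adapt_feats, ref_mean, prior_mean, prior_var, noise_var):
    """MAP estimate of a per-dimension codec bias b.

    Assumed model (an illustration, not the paper's exact one):
        x = ref_mean + b + noise
        b     ~ N(prior_mean, prior_var)   # learned deviation distribution
        noise ~ N(0, noise_var)
    With a Gaussian prior and likelihood, the MAP estimate is a weighted
    average of the sample bias and the prior mean.
    """
    adapt_feats = np.asarray(adapt_feats, dtype=float)
    prior_mean = np.asarray(prior_mean, dtype=float)
    n = adapt_feats.shape[0]
    if n == 0:
        # No adaptation data: fall back to the prior mean.
        return prior_mean.copy()
    # Bias observed in the adaptation data relative to the reference codec.
    sample_bias = adapt_feats.mean(axis=0) - ref_mean
    # The weight on the data grows with the number of adaptation frames.
    weight = n * prior_var / (n * prior_var + noise_var)
    return weight * sample_bias + (1.0 - weight) * prior_mean


def adjust_means(model_means, bias):
    """Shift every component mean of a speaker model by the estimated bias."""
    return np.asarray(model_means, dtype=float) + bias
```

With few adaptation frames the estimate stays close to the prior mean for the detected codec, and with more frames it moves toward the bias observed in the data. This is the behaviour that lets a MAP-style approach work with very little adaptation data.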

Influence Analysis of Audio Codecs in LPCC Domain
MAP-Based Coding Compensation
Experiments and Discussions
Findings
Conclusions
