Fusion of TEO Phase with MFCC Features for Speaker Verification

Purvi Agrawal, Hemant A Patil

doi:10.1145/2708463.2709057


Abstract


In the last few years, there has been significant work on using temporal features of the speech excitation source for the speaker recognition task, viz., the Linear Prediction (LP) residual and its analytic or instantaneous phase, the group delay method, the glottal flow derivative, etc. This paper proposes score-level fusion of Teager Energy Operator (TEO) phase with Mel Frequency Cepstral Coefficients (MFCC) features for the text-independent Speaker Verification (SV) task. Experiments are performed on an SV system based on the Gaussian Mixture Model-Universal Background Model (GMM-UBM). The proposed fusion of TEO phase with MFCC improves the accuracy of the SV system by 3.10% over state-of-the-art MFCC features alone. On the 2002 NIST Speaker Recognition Evaluation (SRE) database, the fused system performs better than MFCCs alone under both matched and mismatched testing conditions. This indicates that TEO phase contains information complementary to the MFCC features.
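The two operations named in the abstract can be illustrated with a minimal sketch. It assumes the standard discrete form of the TEO, psi[n] = x[n]^2 - x[n-1]*x[n+1], and a simple weighted-sum rule for score-level fusion. The abstract does not state how the TEO phase is derived from the TEO output, which fusion rule is used, or what weight is applied, so the function names and the weight `alpha` below are hypothetical and the phase extraction step is not shown.

```python
import math


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, since the first and last samples
    lack one of the two neighbours the operator needs.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def fuse_scores(mfcc_score, teo_score, alpha=0.5):
    """Weighted-sum score-level fusion of two subsystem scores.

    alpha is a hypothetical fusion weight in [0, 1]; it is an
    assumption of this sketch, not a value taken from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * mfcc_score + (1.0 - alpha) * teo_score


# Sanity check on a pure tone: for x[n] = A * cos(omega * n),
# the TEO output is the constant A^2 * sin(omega)^2.
A, omega = 2.0, 0.3
tone = [A * math.cos(omega * n) for n in range(50)]
psi = teager_energy(tone)
expected = (A * math.sin(omega)) ** 2
```

In a GMM-UBM setting, each subsystem (one trained on MFCC features, one on TEO phase features) would typically produce one score per trial, for example a log-likelihood ratio, and `fuse_scores` would combine the two before the accept/reject threshold is applied. The weight would normally be tuned on held-out data.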
