Abstract

This paper introduces a new feature extraction method, based on a frequency-time analysis approach, for text-independent speaker identification. The technique is motivated by the filter bank summation method of the short-time Fourier transform (STFT) using a Nyquist filter bank. The work aims at higher identification accuracy without increasing the computational effort, and the transform is proposed specifically from a speaker identification perspective. The proposed transformation can be used with both uniform-width and non-uniform-width filter bank representations. A complete experimental evaluation was conducted on a database of 61 speakers using Gaussian mixture speaker models, and the new features were compared with Mel-frequency cepstral coefficient (MFCC) features. The average accuracy of the MFCC feature set was 88.05%. The average accuracy of the proposed feature set was 90.24% with the uniform-width filter bank and 90.42% with the non-uniform-width filter bank. After score-level fusion of the uniform-width and non-uniform-width filter bank features of the proposed transformation, the average accuracy was 92.26%. The discrimination capability of the proposed feature sets has been evaluated statistically using the F-ratio and the J-measure. Experimental results show that the proposed feature sets have higher discrimination capability than the MFCC feature set.
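As an illustration of the statistical evaluation mentioned above, the following is a minimal sketch of a per-dimension F-ratio, assuming the commonly used definition: the variance of the speaker means divided by the average within-speaker variance. The abstract does not state the exact formula used in the paper, so the function name `f_ratio`, the input layout, and the toy data are all hypothetical.

```python
from statistics import fmean, pvariance


def f_ratio(features_by_speaker):
    """Per-dimension F-ratio (assumed definition, not taken from the paper).

    features_by_speaker maps a speaker label to a list of feature vectors,
    all of the same length. Returns one ratio per feature dimension.
    """
    first_vectors = next(iter(features_by_speaker.values()))
    dims = len(first_vectors[0])
    ratios = []
    for d in range(dims):
        # Values of dimension d, grouped by speaker.
        per_speaker = [[vec[d] for vec in vecs]
                       for vecs in features_by_speaker.values()]
        # Between-speaker spread: variance of the speaker means.
        speaker_means = [fmean(vals) for vals in per_speaker]
        between = pvariance(speaker_means)
        # Within-speaker spread: average of the per-speaker variances.
        within = fmean([pvariance(vals) for vals in per_speaker])
        ratios.append(between / within)
    return ratios


# Hypothetical toy data: dimension 0 separates the speakers, dimension 1 does not.
toy = {
    "spk_a": [[0.0, 1.0], [2.0, 1.5]],
    "spk_b": [[10.0, 1.0], [12.0, 1.5]],
}
print(f_ratio(toy))
```

Under this definition, a larger ratio indicates a feature dimension whose values differ more between speakers than they vary within a single speaker.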
