Speaker identification using vector quantization and I-vector with reference to Assamese language

Sruti Sruba Bharali, Sanjib Kr Kalita

doi:10.1109/wispnet.2017.8299740

Abstract

This paper describes the implementation of a speaker identification system for the Assamese language. The database consists of speech samples collected from 15 speakers for ten Assamese words representing the Assamese digits from 0 (shounyo) to 9 (no). Mel Frequency Cepstral Coefficients (MFCC) are used as features. Two independent speaker identification systems are built, one using Vector Quantization (VQ) and one using the I-vector technique. For each technique, three systems are built with different feature sizes. The I-vector system achieves higher identification accuracy than the VQ system, with a maximum accuracy of 92.38% obtained using 39 MFCC features.
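The VQ approach named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the codebook size, the plain k-means training loop, the function names, and the synthetic 39-dimensional vectors standing in for MFCC features are all assumptions made for illustration. Each speaker gets one codebook, and a test utterance is assigned to the speaker whose codebook gives the lowest average distortion.

```python
import numpy as np


def pairwise_sq_dists(features, codebook):
    """Squared Euclidean distance from every feature vector to every codeword."""
    diff = features[:, None, :] - codebook[None, :, :]
    return (diff ** 2).sum(axis=2)


def train_codebook(features, codebook_size=8, n_iter=20, seed=0):
    """Fit a VQ codebook to one speaker's feature vectors with plain k-means.

    codebook_size and n_iter are illustrative values, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Initialise codewords from randomly chosen training vectors.
    idx = rng.choice(len(features), size=codebook_size, replace=False)
    codebook = features[idx].copy()
    for _ in range(n_iter):
        nearest = pairwise_sq_dists(features, codebook).argmin(axis=1)
        for k in range(codebook_size):
            members = features[nearest == k]
            if len(members) > 0:  # keep the old codeword if its cell is empty
                codebook[k] = members.mean(axis=0)
    return codebook


def average_distortion(features, codebook):
    """Mean squared distance from each vector to its nearest codeword."""
    return pairwise_sq_dists(features, codebook).min(axis=1).mean()


def identify(features, codebooks):
    """Return the speaker whose codebook yields the lowest average distortion."""
    return min(codebooks, key=lambda name: average_distortion(features, codebooks[name]))


# Synthetic stand-in for MFCC features: three hypothetical speakers,
# each a well-separated cluster in 39 dimensions.
rng = np.random.default_rng(42)
dim = 39
speaker_means = {f"speaker_{i}": rng.normal(0.0, 5.0, size=dim) for i in range(3)}
train = {name: mean + rng.normal(size=(200, dim)) for name, mean in speaker_means.items()}
codebooks = {name: train_codebook(x) for name, x in train.items()}

# A held-out "utterance" drawn from speaker_1's distribution.
test_utterance = speaker_means["speaker_1"] + rng.normal(size=(50, dim))
predicted = identify(test_utterance, codebooks)
print(predicted)
```

With real speech, the synthetic arrays would be replaced by MFCC frames extracted from each recording. The I-vector system described in the abstract uses a different modelling approach and is not covered by this sketch.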
