Abstract

Feature extraction is one of the most crucial stages of Speaker Identification (SI) and significantly influences the performance of an SI system. Of the many feature extraction approaches, Mel-Frequency Cepstral Coefficients (MFCCs) are the most popular and the most commonly used. MFCC features perform well when the training and test environmental conditions match. However, they are very sensitive to background noise, which significantly degrades SI system performance. To overcome this problem, we propose applying mean and variance normalization followed by auto-regressive moving-average (ARMA) filtering, together known as MVA, to the MFCC features, and then combining the resulting features with the original MFCC features. Our text-independent SI system is based on the Gaussian Mixture Model (GMM), and the experiments were conducted on the TIMIT database. Our method proves to be promising, achieving up to 28% accuracy at a signal-to-noise ratio (SNR) of 5 dB.
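
The MVA step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the filter order, the exact ARMA form, or how the two feature sets are combined. The sketch assumes the commonly cited ARMA smoothing form (average of the previous `order` filtered frames and the current plus next `order` normalized frames), per-utterance normalization, and combination by simple concatenation. The function names `mva` and `combine_with_mva` and the default `order=2` are hypothetical.

```python
import numpy as np


def mva(features, order=2):
    """Mean/variance normalization followed by ARMA smoothing.

    features: array of shape (T, D), one row of D cepstral coefficients per frame.
    order: ARMA filter order (assumed value; not stated in the abstract).
    """
    x = np.asarray(features, dtype=float)

    # Mean and variance normalization per coefficient over the utterance.
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant coefficients
    z = (x - mean) / std

    # ARMA smoothing along time. Frames too close to either edge are
    # left as the normalized values.
    y = z.copy()
    num_frames = len(z)
    for t in range(order, num_frames - order):
        past = y[t - order:t].sum(axis=0)          # already-filtered outputs
        future = z[t:t + order + 1].sum(axis=0)    # current and upcoming inputs
        y[t] = (past + future) / (2 * order + 1)
    return y


def combine_with_mva(mfcc, order=2):
    """Concatenate raw MFCCs with their MVA-processed version (assumed scheme)."""
    mfcc = np.asarray(mfcc, dtype=float)
    return np.hstack([mfcc, mva(mfcc, order)])
```

For a 13-coefficient MFCC matrix of shape `(T, 13)`, `combine_with_mva` returns a `(T, 26)` matrix that could then be passed to a GMM trainer.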
