An Automatic Speaker Recognition System

P Chakraborty, Md Monirul Kabir, Md Shahjahan, Kazuyuki Murase, F Ahmed

doi:10.1007/978-3-540-69158-7_54

Abstract

Speaker recognition is the process of identifying a speaker by analyzing the spectral shape of the voice signal. This is done by extracting features from the voice signal and matching them. Mel-Frequency Cepstrum Coefficient (MFCC) analysis is the feature extraction technique used here; the resulting cepstrum coefficients are the extracted features, and they are taken as the input to the Vector Quantization process. Vector Quantization (VQ) is a typical feature matching technique in which a VQ codebook is generated for each speaker by clustering that speaker's training spectral vectors during a training session. Finally, test data are matched against the trained data by searching for the nearest neighbor in the codebooks. The system is evaluated on its ability to recognize speakers correctly from music and speech data, in both English and Bengali. The recognition accuracy is almost ninety percent, which is comparatively better than Hidden Markov Model (HMM) and Artificial Neural Network (ANN) approaches.

Keywords

MFCC: Mel-Frequency Cepstrum Coefficient; DCT: Discrete Cosine Transform; IIR: Infinite Impulse Response; FIR: Finite Impulse Response; FFT: Fast Fourier Transform; VQ: Vector Quantization
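The VQ training and matching stage described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes the feature vectors (for example MFCCs) have already been extracted, it uses plain k-means as a stand-in for whatever codebook training procedure the paper uses, and the names `train_codebook`, `distortion`, `identify`, and `toy_features`, as well as the synthetic two-dimensional data, are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, size, iters=20, seed=0):
    """Cluster one speaker's training vectors into `size` codewords.

    Assumption: plain k-means is used here for illustration only.
    """
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iters):
        clusters = [[] for _ in codebook]
        for v in vectors:
            nearest = min(range(size), key=lambda i: sq_dist(v, codebook[i]))
            clusters[nearest].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old codeword if its cluster is empty
                dim = len(members[0])
                codebook[i] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return codebook


def distortion(vectors, codebook):
    """Average distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)


def identify(test_vectors, codebooks):
    """Return the speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda name: distortion(test_vectors, codebooks[name]))


def toy_features(center, n, rng):
    """Hypothetical stand-in for extracted features: noisy points near `center`."""
    return [[c + rng.gauss(0, 0.1) for c in center] for _ in range(n)]


rng = random.Random(1)
train = {
    "speaker_a": toy_features([0.0, 0.0], 50, rng),
    "speaker_b": toy_features([3.0, 3.0], 50, rng),
}
codebooks = {name: train_codebook(vecs, size=4) for name, vecs in train.items()}

test = toy_features([3.0, 3.0], 20, rng)
print(identify(test, codebooks))
```

Each speaker gets a separate codebook, and an unknown utterance is assigned to the speaker whose codebook quantizes its feature vectors with the least average distortion, which is the nearest-neighbor search the abstract refers to.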

Full Text