Robust Speaker Identification Incorporating High Frequency Features

Latha Latha

doi:10.1016/j.procs.2016.06.064

Abstract

A speaker identification (SI) system identifies a person from a sample of his or her speech. An SI system should possess a robust feature extraction unit and a good classifier. The mel frequency cepstral coefficient (MFCC) is a long-established feature extraction scheme and is regarded as the standard feature set for speaker identification. The mel filter bank used in the MFCC method captures speaker information more effectively at lower frequencies than at higher frequencies, so the characteristics of the high-frequency region are lost. The proposed method addresses this problem. A speech signal comprises both voiced and unvoiced segments: voiced segments contain high-energy, low-frequency components, whereas unvoiced segments contain low-energy, high-frequency components. In the proposed method, the speech sample is divided into voiced and unvoiced segments. The voiced segments are filtered with a mel filter bank to generate MFCCs from the lower frequencies of the speech signal, and the unvoiced segments are filtered with an inverted mel filter bank to generate inverted MFCCs (IMFCCs) from the higher frequencies.
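The contrast between the two filter banks can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the 20-filter / 512-point FFT / 16 kHz settings, and the common 2595 log10(1 + f/700) mel formula are all assumptions made for illustration, and the inverted bank is built by simply flipping the mel bank along both the filter and frequency axes, which is one widely used construction and may differ from the authors' exact design.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (assumed here; the abstract does not specify one).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_filters=20, n_fft=512, sample_rate=16000):
    """Triangular filters spaced uniformly on the mel scale.

    Returns an array of shape (n_filters, n_fft // 2 + 1).
    All default values are illustrative assumptions.
    """
    n_bins = n_fft // 2 + 1
    # n_filters + 2 edge points: each filter uses three consecutive points.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)

    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        left, centre, right = hz_points[i:i + 3]
        rising = (bin_freqs - left) / (centre - left)
        falling = (right - bin_freqs) / (right - centre)
        # Triangle: the smaller of the two slopes, clipped at zero.
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def inverted_mel_filter_bank(n_filters=20, n_fft=512, sample_rate=16000):
    """Mel bank mirrored about the middle of the band (assumed construction).

    Flipping both axes puts the narrow, densely packed filters at the
    high-frequency end instead of the low-frequency end.
    """
    return mel_filter_bank(n_filters, n_fft, sample_rate)[::-1, ::-1].copy()

if __name__ == "__main__":
    mel = mel_filter_bank()
    inv = inverted_mel_filter_bank()
    freqs = np.linspace(0.0, 8000.0, mel.shape[1])
    print("mel filters peaking below 4 kHz:     ",
          int((freqs[mel.argmax(axis=1)] < 4000).sum()))
    print("inverted filters peaking below 4 kHz:",
          int((freqs[inv.argmax(axis=1)] < 4000).sum()))
```

Under these assumptions most mel filters peak in the lower half of the band, while most inverted filters peak in the upper half, which is why the inverted bank resolves high-frequency detail more finely. In a full pipeline the power spectrum of each frame would be multiplied by the relevant bank, followed by a logarithm and a discrete cosine transform to obtain the cepstral coefficients.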
