Abstract

Emotional speaker recognition has emerged as an important and challenging topic in recent research. Its purpose is to improve speaker recognition performance, which degrades when speech is emotional. Noise robustness is also a crucial requirement for speaker recognition systems used in real-life conditions. This paper presents a methodology for emotional speaker recognition in clean and noisy environments. Mel Frequency Cepstral Coefficients (MFCC), Linear Prediction Cepstral Coefficients (LPCC) and MFCC-Shifted-Delta-Cepstral (MFCC-SDC) coefficients are extracted as features and compared to identify the best-performing representation. The extracted features are then classified using Hidden Markov Models (HMM). The speech samples are taken from the Berlin emotional database (EMO-DB), to which we added real airport noise at various SNR levels. Results reveal that MFCC-SDC outperforms the traditional MFCC and LPCC parameters in both clean and noisy environments, and that both MFCC and MFCC-SDC give satisfactory results in noisy conditions, mostly for the neutral and anger emotions.
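
Adding noise to clean speech at a chosen SNR level can be illustrated with a minimal sketch. This is not the authors' code: the function name `mix_at_snr`, the use of NumPy, the looping of the noise to match the utterance length, and the synthetic signals standing in for EMO-DB speech and airport noise are all assumptions made for illustration.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to the requested SNR in dB.

    Hypothetical helper for illustration; not taken from the paper.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat and trim the noise so it covers the whole utterance (an assumption).
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)

    # Choose gain g so that speech_power / (g**2 * noise_power) = 10**(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * noise

# Synthetic stand-ins for a speech sample and a noise recording.
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)
noise = rng.normal(size=8000)

noisy = mix_at_snr(speech, noise, snr_db=10)
```

Calling the function once per target SNR value would produce one noisy copy of each utterance per level; the specific SNR values used in the paper are not stated in this abstract.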
