Abstract

Speaker recognition performance is usually very high in neutral talking environments; however, it degrades significantly in emotional talking environments. This work proposes, implements, and evaluates a new approach to improving the performance of text-independent speaker identification in emotional talking environments. The proposed approach identifies the unknown speaker by first inferring his/her gender and emotion cues, with Hidden Markov Models (HMMs) as the classifiers. This approach has been tested on our collected speech database. The results show that speaker identification performance based on both gender and emotion cues is higher than that based on gender cues only, emotion cues only, or neither gender nor emotion cues. The results obtained with the proposed approach are close to those obtained in a subjective evaluation by human judges.
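To make the cascade concrete, the following is a minimal sketch of the two-stage decision the abstract describes: score the utterance against gender HMMs, then against emotion HMMs of the inferred gender, and finally identify the speaker among only the models matching both cues. The paper does not specify an implementation; MFCC features, the hmmlearn library, the model dictionaries, and all function names here are assumptions for illustration only.

```python
# Hypothetical sketch of the gender -> emotion -> speaker cascade.
# Assumes utterances are MFCC sequences of shape (n_frames, n_coeffs);
# hmmlearn and all names below are illustrative, not from the paper.
import numpy as np
from hmmlearn import hmm

def train_hmm(feature_seqs, n_states=5):
    """Fit one Gaussian HMM on a list of MFCC sequences."""
    X = np.vstack(feature_seqs)
    lengths = [len(s) for s in feature_seqs]
    model = hmm.GaussianHMM(n_components=n_states,
                            covariance_type="diag", n_iter=20)
    model.fit(X, lengths)
    return model

def identify(utterance, gender_hmms, emotion_hmms, speaker_hmms):
    """Cascade: infer gender, then emotion, then score only matching speakers.

    gender_hmms:  {"male": hmm, "female": hmm}
    emotion_hmms: {(gender, emotion): hmm}
    speaker_hmms: {(speaker_id, gender, emotion): hmm}
    """
    # Stage 1: pick the gender whose HMM gives the highest log-likelihood.
    gender = max(gender_hmms, key=lambda g: gender_hmms[g].score(utterance))
    # Stage 2: pick the emotion, restricted to models of the inferred gender.
    emotion = max(
        (e for (g, e) in emotion_hmms if g == gender),
        key=lambda e: emotion_hmms[(gender, e)].score(utterance),
    )
    # Stage 3: identify the speaker among models matching both cues.
    candidates = {s: m for (s, g, e), m in speaker_hmms.items()
                  if g == gender and e == emotion}
    return max(candidates, key=lambda s: candidates[s].score(utterance))
```

The design point this illustrates is the search-space reduction: once gender and emotion are fixed, only the speaker models trained under that condition are scored, rather than every speaker model in the database.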
