Novel third-order hidden Markov models for speaker identification in shouted talking environments

Ismail Shahin

doi:10.1016/j.engappai.2014.07.006

Abstract

Speaker identification systems perform almost perfectly in neutral talking environments but poorly in shouted talking environments. This work proposes, implements, and evaluates novel models, called Third-Order Hidden Markov Models (HMM3s), to improve the performance of text-independent speaker identification systems in shouted talking environments. The proposed models were evaluated on our collected speech database using Mel-Frequency Cepstral Coefficients (MFCCs). Our results show that HMM3s significantly improve speaker identification performance in shouted talking environments, by 12.4% over second-order hidden Markov models (HMM2s) and by 202.4% over first-order hidden Markov models (HMM1s). The results achieved with the proposed models are close to those obtained in subjective assessment by human listeners.
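
To illustrate what "third-order" means here, the following minimal sketch estimates state-transition probabilities in which the next state depends on the three preceding states, rather than on one (HMM1) or two (HMM2). It is not the author's implementation: the function name `third_order_transitions`, the toy state sequence, and the relative-frequency estimator are assumptions made for illustration, and a full HMM3 would also need observation (emission) probabilities and a training procedure that the abstract does not describe.

```python
from collections import Counter, defaultdict


def third_order_transitions(states):
    """Estimate P(s_t | s_{t-3}, s_{t-2}, s_{t-1}) by relative frequency.

    `states` is a sequence of hashable state labels. This is a hypothetical
    helper for illustration only, not the estimator used in the paper.
    """
    counts = defaultdict(Counter)
    for i in range(3, len(states)):
        history = tuple(states[i - 3:i])  # the three preceding states
        counts[history][states[i]] += 1

    probs = {}
    for history, next_counts in counts.items():
        total = sum(next_counts.values())
        probs[history] = {s: c / total for s, c in next_counts.items()}
    return probs


# Toy state sequence (made up for this example, not data from the paper).
toy_sequence = [0, 0, 1, 2, 0, 0, 1, 1, 0, 0, 1, 2]
probs = third_order_transitions(toy_sequence)

for history, dist in sorted(probs.items()):
    print(history, dist)
```

A first-order model would key the same table on a single preceding state, so the number of possible histories grows from N to N^3 for N states. That growth is the usual cost of the longer memory a third-order model provides.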

Full Text