Text-Independent Speaker Identification Using the Histogram Transform Model

Zhanyu Ma, Hong Yu, Zheng-Hua Tan, Jun Guo

doi:10.1109/access.2016.2646458

Open Access

https://doi.org/10.1109/access.2016.2646458

Abstract

In this paper, we propose a novel probabilistic method for text-independent speaker identification (SI). To capture dynamic information during SI, we design super mel-frequency cepstral coefficient (super-MFCC) features by cascading three neighboring MFCC frames. These super-MFCC vectors are used to train a probabilistic model so that the speaker's characteristics are sufficiently captured. The probability density function (PDF) of the super-MFCC features is estimated with the recently proposed histogram transform (HT) method. To mitigate the discontinuity problem that commonly occurs when computing multivariate histograms, the HT method generates additional training data, from which a smooth PDF of the super-MFCC vectors is obtained. Compared with typical PDF estimation methods, such as the Gaussian mixture model, the HT-based model yields promising improvements in SI.
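The frame-cascading step described in the abstract can be illustrated with a minimal sketch. It assumes the MFCCs are already available as a NumPy array of shape (num_frames, num_coeffs), that the "three neighboring frames" are the previous, current, and next frame, and that the first and last frames are dropped because they lack a neighbor. The function name stack_super_frames and the context parameter are hypothetical and are not taken from the paper; the paper's exact stacking order and edge handling may differ.

```python
import numpy as np


def stack_super_frames(mfcc: np.ndarray, context: int = 1) -> np.ndarray:
    """Concatenate each frame with its `context` left and right neighbors.

    mfcc:    array of shape (num_frames, num_coeffs)
    returns: array of shape (num_frames - 2 * context,
                             (2 * context + 1) * num_coeffs)

    Assumption (not from the paper): frames without a full set of
    neighbors at the start and end of the utterance are dropped.
    """
    if mfcc.ndim != 2:
        raise ValueError("mfcc must be a 2-D array (frames x coefficients)")
    num_frames = mfcc.shape[0]
    width = 2 * context + 1  # 3 frames when context == 1
    if num_frames < width:
        raise ValueError("not enough frames to build one super-frame")
    out_len = num_frames - 2 * context
    # Shifted views of the frame sequence, joined along the feature axis:
    # offset 0 holds the previous frames, 1 the current, 2 the next.
    shifted = [mfcc[k:k + out_len] for k in range(width)]
    return np.concatenate(shifted, axis=1)


# Toy input: 4 frames of 3 coefficients each (values are placeholders).
toy_mfcc = np.arange(12, dtype=float).reshape(4, 3)
super_mfcc = stack_super_frames(toy_mfcc)
print(super_mfcc.shape)
```

With context=1 each output row has three times the dimensionality of a single MFCC frame, so the density estimator sees short-term temporal dynamics rather than isolated frames.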

Full Text