Comparison of one and two-level architecture of the GMM-based speaker age classifier

Jiri Pribil, Jindrich Matousek, Anna Pribilova

doi:10.1109/tsp.2016.7760883

Abstract

The paper describes an experiment using Gaussian mixture models (GMM) for automatic classification of speaker age and gender. The developed two-level architecture is compared in detail with the standard one-level GMM classifier, analysing the influence of the number of mixtures and of the types of speech features used for GMM gender/age classification, as well as the computational complexity as a function of the number of mixtures. Finally, the GMM classification accuracy is compared with an evaluation by the conventional listening test method. The mean age classification accuracy of 92.3 % obtained for the proposed two-level architecture is better than that of the standard one-level architecture (78.7 %) and that of the listening test (74.6 %). However, the computational complexity of the two-level architecture is about twice that of the one-level architecture, in both the GMM model creation and the classification phases.
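The difference between the two architectures can be illustrated with a minimal sketch. It assumes, since the abstract does not spell this out, that the first level selects the gender and the second level selects the age class among models trained for that gender, while the one-level classifier scores all gender/age models at once. The function names, the one-dimensional "pitch-like" feature, and all model parameters below are hypothetical toy values chosen for illustration, not data or settings from the paper.

```python
import math

def gmm_loglik(frames, gmm):
    """Total log-likelihood of 1-D feature frames under a GMM.

    gmm is a list of (weight, mean, variance) tuples (hypothetical toy format).
    """
    total = 0.0
    for x in frames:
        # Per-component log densities, combined with log-sum-exp for stability.
        logs = [
            math.log(w) - 0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
            for w, mu, var in gmm
        ]
        m = max(logs)
        total += m + math.log(sum(math.exp(v - m) for v in logs))
    return total

def best_class(frames, models):
    """Return the label whose GMM yields the highest log-likelihood."""
    return max(models, key=lambda label: gmm_loglik(frames, models[label]))

def classify_one_level(frames, joint_models):
    """One level: a single decision over all (gender, age) models."""
    return best_class(frames, joint_models)

def classify_two_level(frames, gender_models, age_models_by_gender):
    """Two levels (assumed order): gender first, then age within that gender."""
    gender = best_class(frames, gender_models)
    age = best_class(frames, age_models_by_gender[gender])
    return gender, age

# Made-up toy models: one Gaussian per class on a pitch-like feature.
gender_models = {
    "male": [(1.0, 120.0, 400.0)],
    "female": [(1.0, 210.0, 400.0)],
}
age_models_by_gender = {
    "male": {"young": [(1.0, 125.0, 100.0)], "senior": [(1.0, 110.0, 100.0)]},
    "female": {"young": [(1.0, 220.0, 100.0)], "senior": [(1.0, 195.0, 100.0)]},
}
# One-level variant: the same age models, keyed by (gender, age).
joint_models = {
    (g, a): gmm for g, ages in age_models_by_gender.items() for a, gmm in ages.items()
}

frames = [212.0, 225.0, 218.0]  # made-up feature frames of one utterance
result_two_level = classify_two_level(frames, gender_models, age_models_by_gender)
result_one_level = classify_one_level(frames, joint_models)
```

In this sketch each utterance is scored in two successive passes in the two-level case, which gives an intuition for the extra computation the abstract reports, although the actual cost depends on the number of mixtures and models used in each level.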

Full Text