Abstract

This paper describes an experiment that uses Gaussian mixture models (GMMs) to classify speaker age and gender automatically. The proposed two-level architecture is compared with a standard one-level GMM classifier. The comparison analyses in detail how the number of mixtures and the type of speech features affect gender/age classification, and how the computational complexity depends on the number of mixtures used. Finally, the GMM classification accuracy is compared with the results of a conventional listening test. The proposed two-level architecture achieves a mean age classification accuracy of 92.3 %, better than both the standard one-level architecture (78.7 %) and the listening test (74.6 %). However, the computational complexity of the two-level architecture is about twice that of the one-level architecture, in both the GMM model creation and the classification phases.
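The abstract does not spell out what the two levels are. As a minimal sketch, assume that the first level selects the gender and the second level selects the age class using only the models for that gender. Every name, feature value, and model parameter below is hypothetical and chosen for illustration; none of it is the authors' implementation or data. Each class is scored by the mean per-frame log-likelihood of a diagonal-covariance GMM, and the highest-scoring class wins.

```python
import math

def log_gauss_diag(x, mean, var):
    # Log density of a diagonal-covariance Gaussian at feature vector x.
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_score(frames, gmm):
    # Mean per-frame log-likelihood; gmm is a list of (weight, mean, var).
    total = 0.0
    for x in frames:
        logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(logs)  # log-sum-exp for numerical stability
        total += top + math.log(sum(math.exp(l - top) for l in logs))
    return total / len(frames)

def best(frames, models):
    # Label of the model with the highest score.
    return max(models, key=lambda label: gmm_score(frames, models[label]))

def classify_two_level(frames, gender_models, age_models):
    # Assumed level 1: gender. Assumed level 2: age class within that gender.
    gender = best(frames, gender_models)
    age = best(frames, age_models[gender])
    return gender, age

# Hypothetical toy models over a single pitch-like feature (not from the paper).
GENDER_MODELS = {
    "male": [(0.5, [110.0], [400.0]), (0.5, [130.0], [400.0])],
    "female": [(0.5, [200.0], [400.0]), (0.5, [220.0], [400.0])],
}
AGE_MODELS = {
    "male": {
        "young": [(0.5, [125.0], [100.0]), (0.5, [135.0], [100.0])],
        "senior": [(0.5, [100.0], [100.0]), (0.5, [110.0], [100.0])],
    },
    "female": {
        "young": [(0.5, [215.0], [100.0]), (0.5, [225.0], [100.0])],
        "senior": [(0.5, [185.0], [100.0]), (0.5, [195.0], [100.0])],
    },
}

frames = [[128.0], [131.0], [126.0]]
print(classify_two_level(frames, GENDER_MODELS, AGE_MODELS))
```

In this sketch every utterance is scored in two passes, once per level, which is consistent with the abstract's remark that the two-level cost is roughly double; the paper's own cost accounting may differ.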
