Increasing the Accuracy of Automatic Speaker Age Estimation by Using Multiple UBMs

Azam Bastanfard, Mahdi Hasani, Dariush Amirkhani

doi:10.1109/kbei.2019.8735005

Abstract

In recent years, many studies have addressed speaker age estimation and many methods have been proposed, yet accuracy remains a challenge. One of the more accurate approaches uses i-vectors to estimate age automatically from the speech signal. This approach uses a single universal background model (UBM) for the Gaussian mixture model. In this paper, we increase the number of UBMs to optimize the Gaussian mixture model and improve the accuracy of age estimation. Using multiple UBMs with different numbers of Gaussian components, we extract multiple i-vectors for each speaker, one per UBM. The speaker's age is then estimated once per i-vector, and the average of these estimates is taken as the speaker's age. In addition, the results of many age estimation experiments show that perceptual linear prediction (PLP) features can increase the accuracy of the estimated age, so our second proposal is to use these features. Finally, to make the voice features more discriminative, the features are mapped to a new space; the mapping is obtained by training a deep belief network. The proposed algorithm was tested on the NIST 2004, NIST 2005 and NIST 2008 databases. Compared with the single-UBM method, the proposed method achieves a Pearson correlation of 0.8 and a mean absolute error of 5.14, improvements of 6.67% and 17%, respectively.
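The fusion and evaluation steps described in the abstract can be illustrated with a minimal sketch. It assumes that each UBM has already produced one age estimate per speaker (i-vector extraction and the regressor are not shown), and it uses invented toy numbers rather than data from the paper. The function names `average_age_estimates`, `mean_absolute_error` and `pearson_correlation` are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import mean


def average_age_estimates(per_ubm_estimates):
    """Fuse per-UBM age estimates by averaging them for each speaker.

    per_ubm_estimates: list of lists; each inner list holds the estimates
    obtained with one UBM, ordered by speaker (assumed input layout).
    """
    return [mean(speaker_estimates) for speaker_estimates in zip(*per_ubm_estimates)]


def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages."""
    return mean(abs(t - p) for t, p in zip(true_ages, predicted_ages))


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Toy data (invented for illustration): three speakers, three UBMs.
true_ages = [25.0, 40.0, 60.0]
per_ubm_estimates = [
    [22.0, 43.0, 55.0],  # estimates from the first UBM
    [28.0, 41.0, 62.0],  # estimates from the second UBM
    [25.0, 39.0, 63.0],  # estimates from the third UBM
]

fused_ages = average_age_estimates(per_ubm_estimates)
fused_mae = mean_absolute_error(true_ages, fused_ages)
fused_correlation = pearson_correlation(true_ages, fused_ages)

print("fused estimates:", fused_ages)
print("mean absolute error:", fused_mae)
print("Pearson correlation:", fused_correlation)
```

In this toy example the individual UBMs err in different directions, so the averaged estimate lies closer to the true age than most single estimates. Whether real systems behave this way depends on how correlated the per-UBM errors are.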

Full Text