Abstract

Language-independent and alignment-free phonological and phonemic features were applied for automatic age estimation based on voice and speech properties. 110 persons (average age: 75.7 years) read the German version of the text "The North Wind and the Sun". For comparison with the automatic approach, five listeners estimated the speakers' age perceptually. Support Vector Regression and feature selection were used to compute the best model of aging. This model was found to use the following features: (a) the percentage of voiced frames, (b) eight phonological features, representing vowel height, nasality in consonants, turbulence, and position of the lips, and finally, (c) seven phonemic features. The latter features might be relevant due to altered articulation because of dentures. The mean absolute error between computed and chronological age was 5.2 years (RMSE: 7.0). It was 7.7 years (RMSE: 9.6) for an optimistic trivial estimator and 10.5 years (RMSE: 11.9) for the average listener.
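
The two error measures quoted above can be illustrated with a minimal Python sketch. The ages and model outputs below are made-up values, not data from the study, and the function names are hypothetical. The sketch also assumes that the "optimistic trivial estimator" predicts the mean age of the very speakers it is evaluated on; the abstract does not define it, so this reading is an assumption.

```python
import math


def mae(predicted, actual):
    """Mean absolute error between predicted and chronological ages."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error between predicted and chronological ages."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def trivial_predictions(actual):
    """Assumed trivial baseline: predict the sample mean age for everyone."""
    mean_age = sum(actual) / len(actual)
    return [mean_age] * len(actual)


# Made-up example values, for illustration only.
ages = [60.0, 70.0, 80.0, 90.0]
model_out = [64.0, 68.0, 83.0, 85.0]

print("model:   MAE", mae(model_out, ages), "RMSE", rmse(model_out, ages))
baseline = trivial_predictions(ages)
print("trivial: MAE", mae(baseline, ages), "RMSE", rmse(baseline, ages))
```

Because RMSE squares each error before averaging, it is never smaller than MAE and grows faster when a few estimates are far off, which is consistent with every RMSE in the abstract exceeding the corresponding mean absolute error.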
