Abstract

The emergence of a variety of applications aimed at video gaming, parental control, education, specific language impairment, child development assessment, and speech therapy creates a demand for age-targeted approaches. Yet there is a lack of methods providing robust and easily interpretable age estimation of speakers from early childhood to the post-pubertal stage. This study aims to provide a fully automated approach for predicting children's age from voice acoustics. Sustained phonations of the vowels /a/, /e/, /i/, /o/, and /u/ recorded from 255 speakers (132 girls and 123 boys) between 4 and 15 years of age were analysed. The first three formant frequencies and the fundamental frequency of each vowel were automatically extracted and used as features for linear and nonlinear regressors to build the prediction model. We demonstrate rapid age estimation with an average difference of 1.3 years from the children's actual chronological age. A lower age prediction error was achieved for boys (1.2 years) than for girls (1.5 years). Prediction was less accurate in early childhood, from 4 to 5 years. No effect of utterance duration on the estimated results was observed. Our results present a robust technology with clinically interpretable outcomes, insusceptible to overfitting, that enables prediction of children's age across a wide age range. The better prediction accuracy for boys than for girls appears to reflect the faster vocal tract growth in males. The lower prediction accuracy in early childhood can be attributed to rapid nonlinear development and greater variability in the level of motor control maturation.
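The feature layout described above (fundamental frequency plus three formants for each of five vowels, giving 20 features per speaker) can be illustrated with a minimal sketch. Everything in it beyond that layout is an assumption made for illustration: the data are synthetic placeholders rather than measurements from the study, the baseline, slope, and noise values are arbitrary, ordinary least squares stands in for the unspecified "linear regressor", and 5-fold cross-validated mean absolute error is assumed as the error metric. All function names are hypothetical.

```python
import numpy as np

VOWELS = ["a", "e", "i", "o", "u"]
MEASURES = ["F0", "F1", "F2", "F3"]  # fundamental frequency + first three formants


def make_synthetic_features(n_speakers=255, seed=0):
    """Generate placeholder data. NOT real measurements; for illustration only."""
    rng = np.random.default_rng(seed)
    ages = rng.uniform(4.0, 15.0, size=n_speakers)
    # Hypothetical baselines (Hz), per-year declines, and noise levels,
    # chosen arbitrarily so that frequencies fall as age increases.
    base = {"F0": 320.0, "F1": 900.0, "F2": 2400.0, "F3": 3800.0}
    slope = {"F0": 8.0, "F1": 20.0, "F2": 50.0, "F3": 60.0}
    noise = {"F0": 25.0, "F1": 60.0, "F2": 150.0, "F3": 180.0}
    columns = []
    for _vowel in VOWELS:
        for m in MEASURES:
            values = base[m] - slope[m] * ages + rng.normal(0.0, noise[m], n_speakers)
            columns.append(values)
    return np.column_stack(columns), ages


def fit_linear(X, y):
    """Ordinary least squares with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply the fitted intercept and weights to a feature matrix."""
    return coef[0] + X @ coef[1:]


def cross_validated_mae(X, y, n_folds=5, seed=0):
    """Mean absolute error (in years) pooled over held-out folds."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    errors = []
    for fold in np.array_split(order, n_folds):
        train = np.setdiff1d(order, fold)  # all speakers not in the held-out fold
        coef = fit_linear(X[train], y[train])
        errors.append(np.abs(predict(coef, X[fold]) - y[fold]))
    return float(np.mean(np.concatenate(errors)))


X, ages = make_synthetic_features()
mae = cross_validated_mae(X, ages)
print(f"features: {X.shape}, cross-validated MAE: {mae:.2f} years")
```

The error printed here reflects only the arbitrary synthetic parameters and says nothing about the accuracy reported in the study; the sketch is meant to show how per-vowel acoustic features map to a single age estimate and how an average error in years would be computed.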
