Abstract

This paper addresses age estimation from Japanese speech utterances using paralinguistic information in speech. Speech conveys not only linguistic information but also paralinguistic and non-linguistic information. Non-linguistic information includes potentially valuable social information such as personal parameters, body conditions, gender, and age. This information, however, has not been sufficiently investigated. The main purpose of our study was to develop less-invasive methods to extract this information, by proposing a method to classify speakers' ages into five groups (20s, 30s, 40s, 50s, and 60s). 6375-dimensional audio features were extracted from 1579 samples of 913 male speakers drawn from the Corpus of Spontaneous Japanese (CSJ), and feature selection was conducted based on Fisher's ratio. Naive Bayes with the top 20 features ranked by Fisher's ratio yielded the highest accuracy, 51.2%.
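As an illustration of the feature-selection step, the following minimal sketch ranks features by Fisher's ratio and keeps the top k. It assumes one common definition of the ratio (variance of the class means divided by the mean within-class variance); the abstract does not state the exact formulation used in the study. The function names `fisher_ratio` and `top_k_features` and the toy data are hypothetical, and the Naive Bayes classifier itself is not shown.

```python
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Fisher's ratio of a single feature.

    Assumed definition (one of several in use): variance of the class
    means divided by the mean of the within-class variances.
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    between = pvariance([mean(g) for g in groups.values()])
    within = mean([pvariance(g) for g in groups.values()])
    if within == 0:
        # Degenerate case: no spread inside any class.
        return float("inf") if between > 0 else 0.0
    return between / within


def top_k_features(rows, labels, k):
    """Indices of the k features with the largest Fisher's ratio."""
    n_features = len(rows[0])
    scores = [
        fisher_ratio([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data (hypothetical): 4 utterances, 2 features, 2 age groups.
rows = [
    [1.0, 5.0],
    [1.2, 1.0],
    [3.0, 4.0],
    [3.2, 2.0],
]
labels = ["20s", "20s", "30s", "30s"]

# Feature 0 separates the groups; feature 1 does not.
selected = top_k_features(rows, labels, k=1)
```

In the setting described above, `rows` would hold the 6375-dimensional feature vectors and `k` would be 20; the selected columns would then be passed to the classifier.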
