Age estimation in Japanese speech based on feature selection

Atsushi Morimoto, Yoichi Yamashita, Masahiro Niitsuma

doi:10.1121/1.4971116

Abstract

This paper addresses age estimation from Japanese speech utterances using paralinguistic information in speech. Speech conveys not only linguistic information but also paralinguistic and non-linguistic information. Non-linguistic information includes potentially valuable social information such as personal characteristics, physical condition, gender, and age. This information, however, has not been investigated sufficiently. The main purpose of our study was to develop less-invasive methods to extract it, and we propose a method that classifies a speaker's age into five groups (20s, 30s, 40s, 50s, and 60s). 6375-dimensional audio features were extracted from 1579 samples from 913 male speakers in the CSJ (Corpus of Spontaneous Japanese), and feature selection was conducted based on Fisher's ratio. Naive Bayes using the top 20 features ranked by Fisher's ratio yielded the highest accuracy, 51.2%.
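The pipeline described in the abstract (rank features by Fisher's ratio, keep the top 20, classify with Naive Bayes) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the exact Fisher's ratio formula or the Naive Bayes variant, so the sketch assumes a common multi-class definition (between-class scatter divided by within-class scatter, computed per feature) and a Gaussian Naive Bayes model. The names `fisher_ratio`, `select_top_k`, and `GaussianNaiveBayes` are hypothetical.

```python
import numpy as np


def fisher_ratio(X, y):
    """Per-feature ratio of between-class to within-class scatter.

    Assumption: one common multi-class form of Fisher's ratio;
    the paper may use a different variant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        between += len(Xc) * (class_mean - overall_mean) ** 2
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    # Small constant guards against division by zero for constant features.
    return between / (within + 1e-12)


def select_top_k(X, y, k=20):
    """Indices of the k features with the largest Fisher's ratio."""
    scores = fisher_ratio(X, y)
    return np.argsort(scores)[::-1][:k]


class GaussianNaiveBayes:
    """Gaussian Naive Bayes (assumed variant; not stated in the abstract)."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # Variance floor keeps the log-likelihood finite.
        self.vars_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        self.log_priors_ = np.log([(y == c).mean() for c in self.classes_])
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # diff has shape (n_samples, n_classes, n_features).
        diff = X[:, None, :] - self.means_[None, :, :]
        log_lik = -0.5 * (
            np.log(2.0 * np.pi * self.vars_)[None, :, :]
            + diff ** 2 / self.vars_[None, :, :]
        ).sum(axis=2)
        return self.classes_[np.argmax(log_lik + self.log_priors_, axis=1)]


if __name__ == "__main__":
    # Synthetic stand-in data; the CSJ features are not reproduced here.
    rng = np.random.default_rng(0)
    y = rng.integers(0, 5, size=300)      # five age groups
    X = rng.normal(size=(300, 50))        # 50 hypothetical features
    X[:, :3] += 2.0 * y[:, None]          # only the first 3 are informative
    top = select_top_k(X, y, k=3)
    model = GaussianNaiveBayes().fit(X[:, top], y)
    print("selected features:", sorted(top.tolist()))
    print("training accuracy:", (model.predict(X[:, top]) == y).mean())
```

In a real experiment the feature ranking and the classifier would both be fitted on training data only and evaluated on held-out speakers, so that the selection step does not leak information from the test set.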
