Abstract

With the growth of the social-robot market, much research has been devoted to facial expression recognition, an important function of a social robot. Facial expression recognition models perform well on image datasets in which emotions are expressed without speaking. In reality, however, humans often express emotions while speaking, moving the muscles around the mouth, and ignoring this speech effect leads to unsatisfactory recognition results. In this paper, we investigate two points to consider when training a facial expression recognition model. First, we examine whether the information around the mouth misleads the recognition model during speech acts, as it does in facial expression recognition on non-speech acts, or whether it carries information that is valid for recognizing expressions. Second, we perform Generative Adversarial Network (GAN)-based data augmentation to address the low accuracy of the recognition model on speech acts, which stems from the relatively small subject variance of the RML dataset. The results show that, unlike in non-speech acts, the information around the mouth improves facial expression recognition during speech acts. In addition, GAN-based data augmentation alleviates the accuracy degradation caused by the low variance of the dataset.
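The first point above, testing whether the mouth region helps or hurts recognition, is typically probed by feeding a model the same face with and without that region. A minimal sketch of such an ablation, assuming a fixed rectangular mouth box expressed as image-size fractions (the box coordinates here are hypothetical, not taken from the paper):

```python
import numpy as np

def mask_mouth_region(face, box=(0.6, 0.9, 0.25, 0.75)):
    """Zero out a rectangular mouth region of a face image.

    `face` is an (H, W) or (H, W, C) array; `box` gives
    (top, bottom, left, right) bounds as fractions of the
    image size. The default box is an assumed mouth location
    for illustration only.
    """
    h, w = face.shape[:2]
    t, b, l, r = box
    masked = face.copy()
    # Erase the assumed mouth area so a recognizer sees no mouth cues.
    masked[int(t * h):int(b * h), int(l * w):int(r * w)] = 0
    return masked

# Toy example: an 8x8 grayscale "face" of ones.
face = np.ones((8, 8))
masked = mask_mouth_region(face)
```

Comparing a recognizer's accuracy on `face` versus `masked` inputs then indicates whether the mouth region contributes valid information during speech acts.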
