Abstract

Automatic recognition of facial images showing erotic expressions can help us understand social interaction and detect inappropriate images even when they contain no nudity. This paper is the first to exploit facial cues for automatic Sexual Facial Expression Recognition (SFER). To this end, we introduce a new SFER dataset, Sexual Expression and Activity Faces (SEA-Faces-30k), which contains 30k manually labeled images in three categories: erotic, suggestive-erotic, and non-erotic. Deep Convolutional Neural Networks need large-scale, diverse annotated image datasets to train properly, but gathering that much data is not feasible in this area. We therefore present a new semi-supervised GAN framework, Triple-BigGAN, which learns a generative model and a classifier simultaneously, end to end, from unlabeled or partially labeled data. Triple-BigGAN shows promising classification performance on the SFER task (93.59%) and on five other benchmark datasets: FER-2013, CIFAR-10, Expression in-the-Wild (ExpW), Modified National Institute of Standards and Technology database (MNIST), and Street View House Numbers (SVHN). We also evaluated the quality of 256×256-pixel samples generated by Triple-BigGAN using Inception Score (IS) and Fréchet Inception Distance (FID). Our approach obtained the best FID (19.94%) and IS (97.98%) scores on the SEA-Faces-30k dataset. Finally, we empirically demonstrated that synthetic erotic face images generated by Triple-BigGAN can also improve the classification performance of deep supervised networks.
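
For readers unfamiliar with the Inception Score named above, the following is a minimal sketch of its standard definition, IS = exp(E_x[KL(p(y|x) || p(y))]), and not the authors' evaluation code. The function name `inception_score` and its input format are assumptions made for illustration. In practice the per-image class probabilities come from a pretrained Inception network, and the abstract does not state how the reported scores were computed or normalized.

```python
import math

def inception_score(class_probs):
    """Standard Inception Score from per-image class probabilities.

    class_probs is a hypothetical input: a list of probability vectors
    p(y|x), one per generated image, as produced by some classifier.
    """
    n = len(class_probs)
    k = len(class_probs[0])
    # Marginal label distribution p(y): the average of p(y|x) over all images.
    marginal = [sum(p[j] for p in class_probs) / n for j in range(k)]
    # Mean KL divergence KL(p(y|x) || p(y)); zero-probability terms add nothing.
    mean_kl = 0.0
    for p in class_probs:
        mean_kl += sum(pj * math.log(pj / mj)
                       for pj, mj in zip(p, marginal) if pj > 0)
    mean_kl /= n
    return math.exp(mean_kl)
```

Under this definition the score ranges from 1, when every image gets the same prediction, up to the number of classes, when predictions are confident and spread evenly across classes. Higher IS and lower FID indicate better samples.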
