Abstract

Convolutional neural networks (CNNs) work very well in many computer vision tasks, including face-related problems. However, for age estimation and facial expression recognition (FER), the accuracy of CNNs is still not good enough for real-world use. CNNs do not seem to capture the subtle differences in the thickness and amount of facial wrinkles, which are essential features for age estimation and FER. In addition, real-world face images vary widely because of face rotation and illumination, and CNNs are not robust to rotated objects unless the training data covers every possible variation. To alleviate these problems, we first propose feeding the Gabor filter responses of a face to the CNN along with the original face image. This enhances the wrinkles on the face so that face-related features are found in the earlier convolutional layers, which improves overall performance. We also adopt the capsule network, which has been shown to be robust to object rotation and able to capture the relationships among facial landmarks. We show that the capsule network improves the performance of age estimation and FER over plain CNNs. Moreover, using the Gabor responses as the input to the capsule network improves overall performance on face-related problems compared with recent CNN-based methods.
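
The first idea in the abstract, stacking Gabor filter responses with the original face image as network input, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no filter parameters, so the kernel size, sigma, wavelength, aspect ratio, and number of orientations below are assumed values, and the function names (gabor_kernel, filter2d_same, gabor_input_stack) are hypothetical. The kernel uses the standard textbook form of the real-valued Gabor filter.

```python
import numpy as np

def gabor_kernel(ksize=9, sigma=2.0, theta=0.0, lambd=4.0, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor filter. All default values are assumptions."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate the coordinate frame by theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope multiplied by a cosine carrier along x_r.
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / lambd + psi)
    return envelope * carrier

def filter2d_same(image, kernel):
    """Cross-correlate image with kernel using zero padding (same-size output)."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    windows = np.lib.stride_tricks.sliding_window_view(padded, (kh, kw))
    return np.einsum("ijkl,kl->ij", windows, kernel)

def gabor_input_stack(image, n_orientations=4):
    """Stack the grayscale image with its Gabor responses along a channel axis."""
    thetas = [k * np.pi / n_orientations for k in range(n_orientations)]
    responses = [filter2d_same(image, gabor_kernel(theta=t)) for t in thetas]
    # Channel 0 is the original image; channels 1..n are the Gabor responses.
    return np.stack([image] + responses, axis=0)

if __name__ == "__main__":
    face = np.random.default_rng(0).random((48, 48))  # stand-in for a face crop
    net_input = gabor_input_stack(face, n_orientations=4)
    print(net_input.shape)
```

The resulting array has one channel for the original image plus one per orientation, so it can be passed to the first convolutional layer of a CNN or capsule network that accepts a multi-channel input. Whether the authors concatenate the responses this way, or how many orientations and scales they use, is not stated in the abstract.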
