Abstract

Using two key-frames from a face image sequence, one with the neutral expression and one with the maximum expression intensity, to alleviate inter-subject variations has become a research focus in computer vision for Facial Expression Recognition (FER). To determine these two key-frames automatically, a Generation Difference Convolutional Neural Network (GDCNN) framework is presented that reduces the influence of the high inter-subject variations caused by individual differences. First, for any given input expression, a trained conditional Generative Adversarial Network (cGAN) generates the corresponding neutral-expression frame while preserving identity-related information. Second, a fine-tuned triplet distance model detects the frame with the maximum expression intensity. Finally, an optimized two-stream CNN is designed and trained to extract differential emotion features from the two key-frames for FER, further suppressing inter-subject variations. Extensive comparisons on the CK+, MMI, and Beihang University databases demonstrate the superiority of the proposed GDCNN framework over state-of-the-art methods.
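To make the differential-feature idea concrete, below is a minimal PyTorch sketch, not the authors' implementation: a shared encoder processes the cGAN-generated neutral frame and the detected peak frame, and the classifier operates on the difference of the two feature vectors, so that identity-related components largely cancel. The ResNet-18 backbone, the shared-weight design, the 7-class output, and all names here are illustrative assumptions; the paper's exact two-stream architecture is not specified in this abstract.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class TwoStreamDiffNet(nn.Module):
    """Hypothetical two-stream differential network (sketch only).

    Encodes the generated neutral frame and the detected peak frame
    with a shared backbone, then classifies the feature difference.
    """

    def __init__(self, num_classes: int = 7):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()       # keep the 512-d pooled features
        self.encoder = backbone           # shared weights for both streams (assumption)
        self.classifier = nn.Linear(512, num_classes)

    def forward(self, neutral: torch.Tensor, peak: torch.Tensor) -> torch.Tensor:
        f_neutral = self.encoder(neutral)  # identity + neutral appearance
        f_peak = self.encoder(peak)        # identity + expression at its apex
        diff = f_peak - f_neutral          # identity terms largely cancel,
                                           # leaving the expression-related change
        return self.classifier(diff)

# Usage on dummy data: a batch of 4 face crops, 224x224 RGB.
model = TwoStreamDiffNet(num_classes=7)
neutral = torch.randn(4, 3, 224, 224)      # stands in for cGAN-generated frames
peak = torch.randn(4, 3, 224, 224)         # stands in for triplet-selected frames
logits = model(neutral, peak)
print(logits.shape)                        # torch.Size([4, 7])
```

Subtracting features rather than raw pixels is one plausible reading of "differential emotion features"; whether the original model differences pixels, features, or both is not determinable from the abstract alone.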
