Abstract

Facial expression recognition is an important part of computer vision and has attracted great attention. Although deep learning has advanced facial expression recognition, the task remains challenging because of expression-unrelated factors such as identity, gender, and race. Inspired by the decomposition of an expression into a neutral component and an expression component, we define residual features and propose an end-to-end network framework named Expression Removal and Recognition Network (ERR-Net), which performs expression removal and expression recognition simultaneously. The residual features are represented at two levels: the pixel level and the facial landmark level. Our network focuses on interpreting the encoder’s output and mapping its segments to expressions so as to maximize inter-class distances. We explore an improved generative adversarial network that converts different expressions into neutral expressions (i.e., expression removal), takes the residual images as output, learns the expression components in the process, and classifies the expressions. Extensive ablation experiments show that each improvement added to the network has a clear effect. Experimental comparisons on two benchmarks, CK+ and MMI, demonstrate that the proposed ERR-Net surpasses state-of-the-art methods in terms of accuracy.
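To make the idea of residual features concrete, the following is a minimal sketch of the two representations named above. The abstract does not give the exact definition, so this assumes a residual is a plain element-wise difference between an expressive face and its neutral counterpart. The function names `pixel_residual` and `landmark_residual` and the input types are hypothetical, and the neutral inputs are passed in directly here, whereas in ERR-Net the neutral face is produced by the expression-removal generator.

```python
from typing import List, Tuple

# Assumed toy representations (not taken from the paper):
Image = List[List[float]]               # grayscale image as rows of intensities
Landmarks = List[Tuple[float, float]]   # (x, y) facial landmark coordinates


def pixel_residual(expressive: Image, neutral: Image) -> Image:
    """Per-pixel difference between an expressive face and its neutral version."""
    if len(expressive) != len(neutral):
        raise ValueError("images must have the same height")
    residual = []
    for row_e, row_n in zip(expressive, neutral):
        if len(row_e) != len(row_n):
            raise ValueError("images must have the same width")
        residual.append([e - n for e, n in zip(row_e, row_n)])
    return residual


def landmark_residual(expressive: Landmarks, neutral: Landmarks) -> Landmarks:
    """Per-landmark displacement from the neutral face to the expressive face."""
    if len(expressive) != len(neutral):
        raise ValueError("landmark sets must have the same size")
    return [(xe - xn, ye - yn) for (xe, ye), (xn, yn) in zip(expressive, neutral)]


# Toy 2x2 images and two landmarks, chosen only for illustration.
expressive_img = [[0.5, 0.75], [1.0, 0.25]]
neutral_img = [[0.5, 0.5], [0.75, 0.25]]
img_res = pixel_residual(expressive_img, neutral_img)

expressive_pts = [(10.0, 20.0), (30.0, 42.0)]
neutral_pts = [(10.0, 20.0), (30.0, 40.0)]
pts_res = landmark_residual(expressive_pts, neutral_pts)
```

Under this assumption, regions that do not change with the expression produce zero residuals, so identity-related content largely cancels and the remaining signal is what a classifier would consume.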
