Abstract

Facial expression recognition (FER) is an extremely challenging task under unconstrained conditions. In particular, varying head poses degrade performance dramatically because they cause large variations in the appearance of facial expressions. To address this problem, we propose a local attention network (LAN), which adaptively captures the important facial regions according to pose variations. The LAN emphasizes the more attentive regions while suppressing regions that do not discriminate between classes. To identify attentive regions, we propose a simple yet efficient unsupervised method for annotating coarse-level attention guidance maps. The guidance map assigns an attention value to each region based on whether its features are deformed by the facial pose. Further, the attentive regional features obtained by our LAN are combined with the original global features for pose-invariant FER. We validate our method on a controlled multiview dataset (KDEF), three popular in-the-wild datasets (RAF-DB, FERPlus, and AffectNet), and subsets of these datasets that contain images under pose variation. Extensive experiments show that our LAN substantially improves FER performance under pose variations. Our method also performs favorably against previous methods.
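The idea of combining attention-weighted regional features with global features can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `fuse_local_and_global`, the use of average pooling, and fusion by concatenation are all assumptions chosen for illustration, and the attention map is taken as given rather than learned.

```python
import numpy as np

def fuse_local_and_global(feature_map, attention_map):
    """Hypothetical fusion of attentive regional and global features.

    feature_map:   array of shape (C, H, W), e.g. a CNN feature map.
    attention_map: array of shape (H, W) with non-negative attention values.
    Returns a vector of length 2 * C: [global features, attentive features].
    """
    if feature_map.shape[1:] != attention_map.shape:
        raise ValueError("attention map must match the spatial size of the features")

    # Global branch (assumption): plain average pooling over all positions.
    global_feat = feature_map.mean(axis=(1, 2))

    # Local branch (assumption): attention-weighted average, so regions with
    # low attention contribute little to the pooled descriptor.
    weights = attention_map / (attention_map.sum() + 1e-8)
    local_feat = (feature_map * weights[None, :, :]).sum(axis=(1, 2))

    # Fusion (assumption): simple concatenation of the two descriptors.
    return np.concatenate([global_feat, local_feat])
```

With an attention map that is nonzero at a single position, the local half of the output reduces to the feature vector at that position, while the global half remains the average over the whole map.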
