Abstract
Facial action unit (AU) recognition remains a challenging task because AUs are subtle and non-rigid. A typical solution is to localize the regions correlated with each AU. Current works either predefine the region of interest (ROI) of each AU from prior knowledge, or try to learn the ROI solely from the supervision of AU recognition during training. However, predefinition often neglects important regions, while supervision alone is insufficient to localize ROIs precisely. In this paper, we propose a novel AU recognition method based on prior and adaptive attention. Specifically, we predefine a mask for each AU, in which locations farther from the AU centers specified by prior knowledge have lower weights. A learnable parameter controls the importance of different locations. We then multiply the mask element-wise by a learnable attention map and use the resulting attention map to extract the AU-related feature, so that AU recognition supervises the adaptive learning of the new attention map. Experimental results show that our method (i) outperforms state-of-the-art AU recognition approaches on challenging benchmark datasets, and (ii) accurately infers the regional attention distribution of each AU by combining the advantages of both predefinition and supervision.
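The attention pipeline described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract does not give the functional form of the mask, so the Gaussian decay, the name `sigma` (standing in for the learnable parameter, here a plain float), the helper names, and the normalized weighted pooling are all assumptions made for illustration.

```python
import numpy as np

def prior_mask(height, width, centers, sigma):
    """Prior mask whose weights fall off with distance from the AU centers.

    Assumption: Gaussian decay, taking the maximum over centers. `sigma`
    stands in for the learnable parameter that controls how quickly the
    importance of a location drops.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    mask = np.zeros((height, width))
    for cy, cx in centers:
        squared_dist = (ys - cy) ** 2 + (xs - cx) ** 2
        mask = np.maximum(mask, np.exp(-squared_dist / (2.0 * sigma ** 2)))
    return mask

def combined_attention(mask, learned_attention):
    """Element-wise product of the prior mask and a learned attention map."""
    return mask * learned_attention

def au_feature(features, attention):
    """Attention-weighted average of a (C, H, W) feature map (assumed pooling)."""
    weighted = features * attention[None, :, :]
    return weighted.sum(axis=(1, 2)) / (attention.sum() + 1e-8)

# Hypothetical example: one AU with two centers on an 8x8 feature map.
mask = prior_mask(8, 8, centers=[(2, 2), (2, 5)], sigma=1.5)
learned = np.random.default_rng(0).random((8, 8))  # placeholder for a learned map
attention = combined_attention(mask, learned)
feature = au_feature(np.ones((4, 8, 8)), attention)
```

In a trainable version, `sigma` and the learned attention map would be parameters updated by the AU recognition loss, which is how recognition supervises the final attention map.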