Abstract

Face parsing and facial expression recognition have attracted increasing interest in recent years. Although existing approaches to face parsing and face representation report strong results, they pursue accuracy at the expense of speed. In this paper, we design a novel multi-task learning network for face parsing and facial expression recognition (MPENet). MPENet consists of shared encoders and three downstream branches. In the edge perceiving branch, we use category edges and binary edges to extract face boundary information and improve the localization of face boundaries. In the segmentation branch, we use graph learning to fuse the edge and semantic information of the image, analyze the relations between different feature regions, and capture more contextual relationships. Finally, we design a consistent learning loss function that forces the different branches to learn the same predictions. Experiments on face datasets show that MPENet combines high precision with fast inference. Specifically, MPENet achieves F1 scores of 85.9 on the CelebAMask-HQ dataset and 92.9 on the LaPa dataset, with an inference speed of 92.9 FPS. Moreover, MPENet precisely delineates the semantic boundaries of facial regions and, through consistent multi-task learning, promotes synergy among the tasks.
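The abstract does not give the form of the consistent learning loss, so the following is only an illustrative sketch of the general idea of tying two branches together, not the MPENet formulation. It assumes one plausible reading: the edges implied by the segmentation branch's label map should agree with the binary edge probabilities from the edge perceiving branch. The function names `edges_from_labels` and `consistency_loss`, the 4-neighbour edge rule, and the use of binary cross-entropy are all assumptions made for illustration.

```python
import math


def edges_from_labels(labels):
    """Mark a pixel as edge (1) if any 4-neighbour carries a different label.

    Hypothetical helper: the edge definition used in the paper may differ.
    """
    h, w = len(labels), len(labels[0])
    edge = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                if inside and labels[ny][nx] != labels[y][x]:
                    edge[y][x] = 1
                    break
    return edge


def consistency_loss(seg_labels, edge_probs, eps=1e-7):
    """Mean binary cross-entropy between segmentation-implied edges
    and the edge branch's per-pixel edge probabilities (assumed form)."""
    target = edges_from_labels(seg_labels)
    total, count = 0.0, 0
    for t_row, p_row in zip(target, edge_probs):
        for t, p in zip(t_row, p_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count


# Toy 2x3 label map: two regions separated by a vertical boundary.
seg = [[0, 0, 1],
       [0, 0, 1]]
agreeing = [[0.0, 1.0, 1.0], [0.0, 1.0, 1.0]]
disagreeing = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
print(consistency_loss(seg, agreeing) < consistency_loss(seg, disagreeing))
```

The loss is small when the two branches agree on where region boundaries lie and large when they disagree. A trainable version would need differentiable soft predictions in a deep learning framework rather than hard label maps; this pure-Python version only shows the quantity being penalized.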
