Abstract

Facial landmark detection is challenging under occlusion, pose variation, or inadequate training samples. We propose a two-branch facial landmark detection network (Facial Detection with Face Structure and Transformers: FDFST) that takes face structure constraints into account. Existing regression-based facial landmark detection models do not fully consider the general facial structure, which usually leads to unstable predictions. In contrast to facial landmarks, facial structure is more likely to be estimated accurately in real scenarios. We therefore provide facial structure guidance for facial landmark detection through a facial structure estimation sub-network. Two targets are predefined to supervise the model: the facial structure, described by five landmarks, and the facial landmarks, denoted by 96 points. To address the lack of occlusion samples, we propose a novel data augmentation method to boost training on the public WFLW dataset. Experiments show that the FDFST network achieves a significant improvement on the WFLW dataset.
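
The two-target supervision described above can be illustrated with a minimal sketch. The abstract does not state the loss used, so everything here is an assumption made for illustration: the mean Euclidean distance as the per-branch error, the weighted sum that combines the two branches, and the names `mean_l2`, `two_target_loss`, and `structure_weight`.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def mean_l2(pred: List[Point], target: List[Point]) -> float:
    """Mean Euclidean distance between matching predicted and target points."""
    if len(pred) != len(target):
        raise ValueError("point lists must have the same length")
    total = 0.0
    for (px, py), (tx, ty) in zip(pred, target):
        total += ((px - tx) ** 2 + (py - ty) ** 2) ** 0.5
    return total / len(pred)


def two_target_loss(
    structure_pred: List[Point],
    structure_gt: List[Point],
    landmark_pred: List[Point],
    landmark_gt: List[Point],
    structure_weight: float = 1.0,  # hypothetical balancing factor, not from the paper
) -> float:
    """Combine the errors of the two branches into one training signal.

    Assumed form: landmark error + structure_weight * structure error.
    """
    landmark_term = mean_l2(landmark_pred, landmark_gt)
    structure_term = mean_l2(structure_pred, structure_gt)
    return landmark_term + structure_weight * structure_term


# Toy example: 5 structure points and 96 landmark points, as in the abstract.
structure_gt = [(0.0, 0.0)] * 5
structure_pred = [(3.0, 4.0)] * 5    # each point is off by 5.0
landmark_gt = [(0.0, 0.0)] * 96
landmark_pred = [(0.0, 1.0)] * 96    # each point is off by 1.0

loss = two_target_loss(
    structure_pred, structure_gt, landmark_pred, landmark_gt, structure_weight=0.5
)
print(loss)  # 1.0 + 0.5 * 5.0 = 3.5
```

In this sketch, the structure branch acts as a coarse constraint: a large error on the five structure points raises the total loss even when the dense landmark error is small.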
