FDFST: facial structure constrained landmark detection using visual transformer

Qianyu Zhou, Xuliang Li, Jiquan Ma, Yanxin Wang, Lidan Wang, Mengyi (Milly) Cen

doi:10.1117/12.2640359

Abstract

Facial landmark detection is challenging under occlusion, pose variation, or inadequate training samples. We propose a two-branch facial landmark detection network (Facial Detection with Face Structure and Transformers: FDFST) that takes face structure constraints into account. Existing regression-based facial landmark detection models do not fully consider the general facial structure, which usually leads to unstable predictions. Compared with the facial landmarks, the facial structure is more likely to be estimated accurately in real scenarios. We therefore provide facial structure guidance for facial landmark detection through a facial structure estimation sub-network. Two targets are predefined to supervise the model: the facial structure, described by five landmarks, and the facial landmarks, denoted by 96 points. To address the lack of occlusion samples, we propose a novel data augmentation method to boost training on the public WFLW dataset. Experiments show that the FDFST network achieves a significant improvement on the WFLW dataset.
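
The two supervision targets can be illustrated with a minimal sketch. The abstract does not state the form of the loss, so the mean Euclidean distance, the weighted sum, and the names `mean_l2`, `two_target_loss`, and `structure_weight` are all assumptions made for illustration, not the authors' formulation.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def mean_l2(pred: Sequence[Point], target: Sequence[Point]) -> float:
    """Mean Euclidean distance between matched 2-D points."""
    if len(pred) != len(target) or not pred:
        raise ValueError("point lists must be non-empty and equally long")
    total = 0.0
    for (px, py), (tx, ty) in zip(pred, target):
        total += ((px - tx) ** 2 + (py - ty) ** 2) ** 0.5
    return total / len(pred)


def two_target_loss(
    struct_pred: Sequence[Point],
    struct_gt: Sequence[Point],
    lmk_pred: Sequence[Point],
    lmk_gt: Sequence[Point],
    structure_weight: float = 1.0,  # hypothetical balancing factor
) -> float:
    """Combine the landmark error with the facial structure error.

    Assumed form: landmark term plus a weighted structure term.
    """
    landmark_term = mean_l2(lmk_pred, lmk_gt)
    structure_term = mean_l2(struct_pred, struct_gt)
    return landmark_term + structure_weight * structure_term
```

Under this assumed form, the structure branch (five points) and the landmark branch (96 points) each contribute one term, and `structure_weight` controls how strongly the structure estimate guides training.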

Full Text