Face anti-spoofing is one of the most crucial parts of face detection, so its accuracy and generalization are particularly important. We therefore propose a multi-branch network that improves the accuracy and generalization of detecting unknown spoofing attacks. The branches consist of several frequency map encoders and one depth map encoder, all trained jointly. The network leverages multiple frequency features and generates depth map features. High-frequency edge texture is beneficial for capturing moiré patterns, while low-frequency features are sensitive to color distortion. Depth maps are more discriminative than RGB images at the visual level and serve as useful auxiliary information. Supervised Multi-view Contrastive Learning enhances multi-view feature learning. Moreover, a two-stage feature fusion method effectively integrates the multi-branch features. Experiments on four public datasets, namely CASIA-FASD, Replay-Attack, MSU-MFSD, and OULU-NPU, demonstrate the effectiveness of the model. In inter-set evaluations, the average Half Total Error Rate (HTER) of our model is 4 percentage points lower (21% versus 25%) than that of the Adversarial Domain Adaptation method.
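For readers unfamiliar with the reported metric, HTER is conventionally defined as the mean of the false acceptance rate (FAR) and the false rejection rate (FRR). The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `hter` and the label convention (1 = genuine face, 0 = spoof attack) are assumptions made for illustration.

```python
def hter(labels, predictions):
    """Half Total Error Rate: the mean of FAR and FRR.

    Assumed convention (not from the paper): 1 = genuine, 0 = spoof attack.
    """
    # Split the predictions by the true class of each sample.
    attacks = [p for y, p in zip(labels, predictions) if y == 0]
    genuine = [p for y, p in zip(labels, predictions) if y == 1]
    if not attacks or not genuine:
        raise ValueError("both genuine and attack samples are required")

    # FAR: fraction of attack samples wrongly accepted as genuine.
    far = sum(1 for p in attacks if p == 1) / len(attacks)
    # FRR: fraction of genuine samples wrongly rejected as attacks.
    frr = sum(1 for p in genuine if p == 0) / len(genuine)
    return (far + frr) / 2


# Toy example: 1 of 4 genuine samples rejected, 2 of 4 attacks accepted.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_predictions = [1, 1, 1, 0, 0, 0, 1, 1]
print(hter(toy_labels, toy_predictions))
```

Because HTER averages the two error rates, it is not dominated by whichever class has more samples, which is one reason it is commonly used for cross-dataset comparisons.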