Abstract

Local features contain crucial clues for face anti-spoofing. Convolutional neural networks (CNNs) are powerful at extracting local features, but the intrinsic inductive bias of CNNs limits their ability to capture long-range dependencies. This paper aims to develop a simple yet effective framework that is versatile in extracting both local information and long-range dependencies for face anti-spoofing. To this end, we propose a novel architecture, namely Conv-MLP, which incorporates local patch convolution with global multi-layer perceptrons (MLPs). Conv-MLP breaks the inductive bias limitation of traditional fully convolutional networks and can thus better exploit long-range dependencies. Furthermore, we design a new loss specifically for the face anti-spoofing task, namely the moat loss. The moat loss benefits discriminative representation learning and improves generalization to unseen presentation attacks. In this work, multi-modal data are fused directly at the signal level to extract complementary features. Extensive experiments on single- and multi-modal datasets demonstrate that Conv-MLP outperforms existing state-of-the-art methods while being more computationally efficient. The code is available at https://github.com/WeihangWANG/Conv-MLP.
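The general idea of pairing local patch convolution with global MLP mixing can be illustrated with a minimal sketch. Note that the abstract does not specify layer sizes or the exact block design, so all shapes, projection dimensions, and the ReLU nonlinearity below are illustrative assumptions, not the paper's actual Conv-MLP architecture:

```python
import numpy as np

# Illustrative sketch only: local patch embedding (a strided "patch
# convolution") followed by a global token-mixing MLP. All dimensions
# are assumptions for demonstration, not taken from the paper.
rng = np.random.default_rng(0)

def patch_embed(img, patch=8, dim=64):
    """Split an HxWxC image into non-overlapping patches and linearly
    project each patch -- this captures *local* features."""
    H, W, C = img.shape
    ph, pw = H // patch, W // patch
    patches = img.reshape(ph, patch, pw, patch, C).transpose(0, 2, 1, 3, 4)
    tokens = patches.reshape(ph * pw, patch * patch * C)
    W_proj = rng.standard_normal((tokens.shape[1], dim)) * 0.02
    return tokens @ W_proj  # (num_patches, dim)

def global_mlp_mix(tokens):
    """Dense layer applied *across* the token dimension, so every patch
    interacts with every other patch -- long-range dependency mixing
    without convolution's locality bias."""
    n = tokens.shape[0]
    W_mix = rng.standard_normal((n, n)) * 0.02
    return np.maximum(W_mix @ tokens, 0.0)  # ReLU assumed for simplicity

img = rng.standard_normal((32, 32, 3))
tokens = patch_embed(img)        # local features per patch
mixed = global_mlp_mix(tokens)   # global mixing across all patches
print(tokens.shape, mixed.shape)  # (16, 64) (16, 64)
```

The key contrast with a full CNN is the token-mixing step: a convolution only combines spatially adjacent activations, whereas the dense mixing matrix lets any two patches exchange information in a single layer.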

