Face anti-spoofing (FAS) is a crucial task in face recognition: it aims to detect and prevent attempts to attack a facial recognition system with fake or manipulated images. In this work, we develop a novel Face Anti-Spoofing Identifier Network (FASIN) with depth and near-infrared (NIR) embeddings, trained on multiple modalities using multi-stream Convolutional Neural Networks (CNNs). The proposed FASIN model processes RGB, depth, and NIR images to extract discriminative features and distinguish genuine faces from fake ones. The depth maps and NIR images are generated from the RGB images by a depth map construction network (DMCN) and a near-infrared construction network (NIRCN), respectively. The FASIN model comprises three sub-networks: one for RGB images, one for depth images, and one for NIR images. Each sub-network consists of multiple CNN layers, which extract features at different scales and levels of abstraction. Because inherent noise and other variables may reduce the efficacy of a CNN, a wavelet spatial attention mechanism is proposed to support the RGB stream; the resulting stream is named wavelet attention CNN (WA-CNN). The extracted features are then concatenated by a multi-modal feature fusion module to obtain a robust feature representation, which is used to classify real and fake faces. An ensemble learning mechanism is attached to the model to learn the concatenated features effectively. Experimental results on four benchmark datasets (CelebA-Spoof, CASIA-SURF, WMCA, and MSU-MFSD) demonstrate the efficacy of the proposed FASIN model compared with state-of-the-art methods. The proposed model achieves high accuracy and low average classification error rates (ACER), indicating its potential for real-world face anti-spoofing systems.
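For readers unfamiliar with the ACER metric mentioned above, the following is a minimal sketch of how it is commonly computed: the mean of the attack presentation classification error rate (APCER) and the bona fide presentation classification error rate (BPCER). The function name `acer`, the label convention (1 = genuine, 0 = spoof), and the pooling of all attack types into a single APCER are assumptions made for illustration; they are not taken from the paper, and evaluation protocols that report APCER per attack type would differ.

```python
def acer(labels, predictions):
    """Return (APCER, BPCER, ACER) for binary anti-spoofing decisions.

    Assumed convention (illustrative, not from the paper):
    1 = genuine (bona fide) face, 0 = spoof (attack).
    All attack types are pooled into one APCER value.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    # Predictions made on attack samples and on genuine samples, respectively.
    on_attacks = [p for y, p in zip(labels, predictions) if y == 0]
    on_genuine = [p for y, p in zip(labels, predictions) if y == 1]
    if not on_attacks or not on_genuine:
        raise ValueError("need at least one attack and one genuine sample")

    # APCER: fraction of attacks wrongly accepted as genuine.
    apcer = sum(1 for p in on_attacks if p == 1) / len(on_attacks)
    # BPCER: fraction of genuine faces wrongly rejected as attacks.
    bpcer = sum(1 for p in on_genuine if p == 0) / len(on_genuine)

    return apcer, bpcer, (apcer + bpcer) / 2


if __name__ == "__main__":
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
    print(acer(y_true, y_pred))
```

Because ACER averages the two error rates, a low value requires the model to both reject spoofs and accept genuine faces; a classifier that rejects everything would have an APCER of 0 but a BPCER of 1.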