To overcome the bottleneck of face recognition in low-light scenarios, Near-InfraRed and VISible (NIR-VIS) heterogeneous face recognition has been proposed for matching well-lit VIS faces with poorly lit NIR faces. Current cross-modal synthesis methods visually convert the NIR modality to the VIS modality and then perform face matching in the VIS modality. However, using a heavyweight GAN on unpaired NIR-VIS faces can make synthesis difficult and inference inefficient, among other problems. To alleviate these problems, we synthesize both NIR and VIS images into modality-independent syncretic images and propose a novel syncretic space learning (SSL) model to eliminate the modal gap. First, the Syncretic Modality Generator (SMG) synthesizes NIR and VIS images into syncretic images using channel-level convolution with a shallow CNN. In particular, the discriminative structural information is well preserved, and the face quality can be further improved with small modal variations in a self-supervised manner. Second, Modality-adversarial Syncretic space Learning (MSL) projects NIR and VIS images into the syncretic space through a syncretic-modality adversarial learning strategy with a syncretic-pattern-guided objective, so that the modal gap between NIR and VIS faces is effectively reduced. Finally, the Syncretic Distribution Consistency (SDC), constructed from NIR-syncretic, syncretic-syncretic, and VIS-syncretic consistency, enhances intra-class compactness and helps learn discriminative representations. Extensive experiments on three challenging datasets demonstrate the effectiveness of the SSL method.
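As a rough illustration of what "channel-level convolution" could mean here, the sketch below applies a single 1x1 convolution, that is, a per-pixel weighted sum across channels, to map both a VIS image and an NIR image to a single-channel image with the same weights. This is a minimal sketch under stated assumptions, not the SMG architecture: the abstract does not specify layer counts, kernel sizes, or how NIR input is fed to the network, so the function names, the fixed weights, and the choice to replicate the NIR channel three times are all hypothetical. In a trained model the weights would be learned rather than set by hand.

```python
from typing import List, Sequence

# An image is stored as nested lists: height x width x channels.
Image = List[List[List[float]]]


def channel_level_conv(image: Image, weights: Sequence[float],
                       bias: float = 0.0) -> List[List[float]]:
    """Apply a 1x1 convolution: each output pixel is a weighted sum
    of that pixel's input channels (no spatial neighbourhood is used)."""
    output = []
    for row in image:
        out_row = []
        for pixel in row:
            if len(pixel) != len(weights):
                raise ValueError("channel count must match number of weights")
            out_row.append(sum(w * c for w, c in zip(weights, pixel)) + bias)
        output.append(out_row)
    return output


def nir_to_three_channels(nir: List[List[float]]) -> Image:
    """Replicate a single-channel NIR image to three channels
    (an assumption made so both modalities share one set of weights)."""
    return [[[value, value, value] for value in row] for row in nir]


# Hypothetical channel weights shared by both modalities.
SHARED_WEIGHTS = (0.25, 0.5, 0.25)

# Tiny 1x2 example images with intensities in [0, 1].
vis = [[[0.2, 0.4, 0.6], [1.0, 1.0, 1.0]]]
nir = [[0.5, 0.0]]

syncretic_from_vis = channel_level_conv(vis, SHARED_WEIGHTS)
syncretic_from_nir = channel_level_conv(nir_to_three_channels(nir), SHARED_WEIGHTS)
```

Because the same weights are applied to both inputs, the two outputs live in one shared single-channel space, which is the general idea behind mapping NIR and VIS images to a common intermediate modality.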