Abstract

Cross-domain face matching, known as heterogeneous face recognition (HFR), extends traditional face recognition and plays a key role in public safety. It remains a challenging problem due to insufficient heterogeneous data and large domain discrepancies. Recent work has focused on learning common features across visual domains and on synthesizing face images from other domains into the visible domain to reduce the domain discrepancy. The former performs domain transformation during face recognition, which requires additional runtime; the latter achieves poorer performance because heterogeneous data are insufficient. We propose a one-way multimodal image-to-image translation method (OMIT) to tackle the problem of insufficient heterogeneous data. Specifically, we learn to generate heterogeneous data from small-scale paired heterogeneous training data, and impose an identity-preserving loss on the generated heterogeneous images to ensure identity consistency. Then, we transform large-scale visible data with abundant identities into paired heterogeneous data, which addresses the lack of identity diversity in small-scale heterogeneous datasets. The generated paired heterogeneous data can be used directly to train HFR networks and improve recognition accuracy. Finally, our method improves the quality of generated images on the BUAA, CASIA NIR–VIS 2.0, Oulu NIR–VIS, and Tufts face databases, and the heterogeneous datasets generated by OMIT improve the performance of HFR networks.
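The abstract does not spell out the identity-preserving loss. A minimal sketch of one common formulation, assuming a frozen pretrained face-embedding network and PyTorch-style tensors (the names `embed_net`, `vis_images`, and `generated_nir` are hypothetical, not from the paper), is:

```python
import torch
import torch.nn.functional as F

def identity_preserving_loss(embed_net, vis_images, generated_nir):
    """Penalize identity drift between a source visible-spectrum face and its
    generated NIR counterpart, using a fixed pretrained face-embedding network.

    embed_net     -- frozen face recognition backbone returning per-image embeddings
    vis_images    -- batch of source VIS faces, shape (N, 3, H, W)
    generated_nir -- batch of NIR faces produced by the translation generator
    """
    with torch.no_grad():  # the embedding network is not updated
        id_vis = F.normalize(embed_net(vis_images), dim=1)
    id_nir = F.normalize(embed_net(generated_nir), dim=1)
    # 1 - cosine similarity: small when both embeddings encode the same identity
    return (1.0 - (id_vis * id_nir).sum(dim=1)).mean()

# Hypothetical use inside a translation training step:
# total_loss = adversarial_loss + reconstruction_loss \
#              + lambda_id * identity_preserving_loss(embed_net, vis_batch, fake_nir)
```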
