Abstract

Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality task that aims to match images of the same identity across different modalities. To alleviate the cross-modality discrepancies between images, existing approaches mainly guide models to mine modality-invariant features. Although these approaches are effective, they discard the modality-specific features that carry important information beneficial to VI-ReID. Therefore, some approaches employ generative adversarial networks to compensate for the missing modality information. However, the quality of the images generated by these methods is usually poor, and most of them still focus only on learning modality-sharable features. To address these problems, this paper proposes a generative-based cross-modality image fusion strategy (GC-IFS), which generates high-quality cross-modality paired images and fuses the information of the two modalities. First, considering the importance of identity-discriminative information in the generated images, we propose a contrastive-learning image generation (CLIG) network to generate cross-modality paired images. Meanwhile, to fully integrate and exploit the information of the two modalities and eliminate the influence of cross-modality discrepancies, we design a part-based dual multi-modality feature fusion (P-DMFF) module to extract a unified feature representation. Extensive experiments on the SYSU-MM01 and RegDB datasets demonstrate that our strategy outperforms state-of-the-art methods on the VI-ReID task.
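The following is a minimal, illustrative sketch of the overall flow the abstract describes: generate a cross-modality counterpart for each input image, then fuse part-based features of the real and generated images into one unified representation. All module names, shapes, and the fusion rule below are assumptions for illustration only; the paper's CLIG generator and P-DMFF module are not reproduced here.

```python
# Hypothetical sketch of a generate-then-fuse pipeline (not the paper's implementation).
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    """Stand-in for an image-generation network: maps an image of one
    modality to a paired image of the other modality."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

class PartFusion(nn.Module):
    """Stand-in for a part-based fusion module: split the feature maps of
    the two modalities into horizontal parts, pool each part, and fuse the
    part-wise descriptors (here by simple averaging)."""
    def __init__(self, num_parts=4):
        super().__init__()
        self.num_parts = num_parts

    def forward(self, feat_a, feat_b):
        descs = []
        for f in (feat_a, feat_b):
            chunks = torch.chunk(f, self.num_parts, dim=2)            # split along height
            descs.append(torch.stack(
                [c.mean(dim=(2, 3)) for c in chunks], dim=1))          # (B, P, C)
        return (descs[0] + descs[1]) / 2                               # unified representation

# Usage with random tensors standing in for a batch of visible images.
backbone = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
generator, fusion = TinyGenerator(), PartFusion(num_parts=4)
visible = torch.randn(2, 3, 128, 64)            # (B, C, H, W)
fake_infrared = generator(visible)               # generated cross-modality pair
unified = fusion(backbone(visible), backbone(fake_infrared))
print(unified.shape)                             # torch.Size([2, 4, 32])
```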
