Interact, Embed, and EnlargE: Boosting Modality-Specific Representations for Multi-Modal Person Re-identification

Zi Wang,Jin Tang,Chenglong Li,Ran He,Aihua Zheng

doi:10.1609/aaai.v36i3.20165

Abstract

Multi-modal person Re-ID introduces more complementary information to assist the traditional Re-ID task. Existing multi-modal methods ignore the importance of modality-specific information in the feature fusion stage. To this end, we propose a novel method to boost modality-specific representations for multi-modal person Re-ID: Interact, Embed, and EnlargE (IEEE). First, we propose a cross-modal interacting module to exchange useful information between different modalities in the feature extraction phase. Second, we propose a relation-based embedding module to enhance the richness of feature descriptors by embedding the global feature into the fine-grained local information. Finally, we propose multi-modal margin loss to force the network to learn modality-specific information for each modality by enlarging the intra-class discrepancy. Superior performance on multi-modal Re-ID dataset RGBNT201 and three constructed Re-ID datasets validate the effectiveness of the proposed method compared with the state-of-the-art approaches.

Full Text