Abstract

Learning rich and discriminative person-related modality-shared feature representations that identify the same person across different modalities is important for RGB-Infrared Person Re-IDentification (RGB-IR ReID). However, most existing models extract modality-shared features directly from modality-specific features without considering the interactions between them, which leads to ineffective exploration of person-related modality-shared features. To address this problem, this paper proposes a novel Modality-specific and Modality-shared Features Interaction Network (M2FINet) for RGB-IR ReID. In particular, M2FINet includes a Cross-level Feature Guidance and Injection (CFGI) module designed to establish and exploit the interactions between the middle-level modality-specific features and the high-level modality-shared features. The CFGI module consists mainly of two streams: a shared-to-specific feature guidance (H2P) stream and a specific-to-shared feature injection (P2H) stream. The H2P stream takes the high-level modality-shared features as prior information to guide the re-exploration of richer and more discriminative modality-shared semantic information from the middle-level modality-specific features. The P2H stream enhances the representation ability and discriminability of the high-level modality-shared features by introducing more modality-shared detail information from the middle-level modality-specific features. In addition, a simple but effective feature aggregation module, Focusing on Person (FOP), reinforces the discriminative modality-shared features within person-related regions through multi-pooling feature aggregation. Extensive experiments on two public benchmarks, SYSU-MM01 and RegDB, show that the proposed model consistently improves the accuracy of RGB-IR ReID. Without bells and whistles, it achieves a Rank-1 accuracy of 74.73% and an mAP of 68.96% on the large-scale SYSU-MM01 dataset.
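
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Rank-1 accuracy and mean average precision (mAP) are commonly computed for a retrieval task such as ReID. It is not the authors' evaluation code: the function name `rank1_and_map`, its arguments, and the toy data are assumptions made for illustration, and benchmark-specific protocol details are omitted.

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (Rank-1, mAP) for a query-by-gallery distance matrix.

    dist[i][j] is the distance between query i and gallery item j
    (smaller means more similar). All names here are hypothetical.
    """
    rank1_hits = 0
    average_precisions = []
    for q, row in enumerate(dist):
        # Rank gallery items from most to least similar.
        order = sorted(range(len(row)), key=lambda j: row[j])
        matches = [gallery_ids[j] == query_ids[q] for j in order]
        if not any(matches):
            continue  # a query with no true match cannot be scored
        # Rank-1: is the top-ranked gallery item the same identity?
        rank1_hits += int(matches[0])
        # Average precision: mean of precision at each true-match rank.
        hits = 0
        precisions = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    if not average_precisions:
        raise ValueError("no query has a matching gallery identity")
    n = len(average_precisions)
    return rank1_hits / n, sum(average_precisions) / n


# Toy example: 2 queries (e.g. infrared), 3 gallery items (e.g. RGB).
toy_dist = [[0.1, 0.5, 0.9],
            [0.8, 0.2, 0.3]]
rank1, mean_ap = rank1_and_map(toy_dist, [1, 2], [1, 2, 1])
print(f"Rank-1 = {rank1:.4f}, mAP = {mean_ap:.4f}")
```

Rank-1 only checks the single best match per query, whereas mAP rewards ranking every correct gallery image near the top, which is why the two numbers are usually reported together.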
