RGB-infrared cross-modality person re-identification is a challenging image retrieval task due to intra-class variations and the cross-modality discrepancy. However, existing methods are insufficient for identifying discriminative patterns at different granularities and aligning them across granularities and modalities in a synergistic manner. In this paper, we propose a Dual-granularity Feature Alignment (DFA) approach for cross-modality re-ID, which accommodates dual-granularity feature extraction and cross-granularity feature alignment. More specifically, a feature extractor is introduced to generate discriminative patterns by dividing the feature map into different granularities. Based on these features, prototype learning is developed to assign different-granularity features of the same identity to the same prototype and to adaptively build alignment across granularities. To mine contextual relationships between training samples, a similarity inference is presented that enforces the similarity matrix to approach the identity matrix. This allows the model to find features that preserve as much modality-shared information as possible while carrying as little modality-specific information as possible. Both modules work collaboratively to achieve optimal intra-class compactness and inter-class separability for cross-modality image matching. Extensive experiments on several benchmarks demonstrate that our DFA surpasses state-of-the-art methods. The code is available at https://github.com/PRIS-CV/DFA.
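The similarity-inference constraint above can be illustrated with a minimal sketch: given one RGB and one IR feature per identity in a batch, the cross-modality cosine-similarity matrix is pushed toward the identity matrix, so each sample is similar only to its same-identity counterpart in the other modality. The function name and the mean-squared penalty below are illustrative assumptions, not the paper's exact loss; the authors' implementation is in the linked repository.

```python
import numpy as np

def similarity_inference_loss(rgb_feats, ir_feats):
    """Illustrative sketch (not the paper's exact formulation): penalize
    deviation of the RGB-to-IR cosine-similarity matrix from the identity
    matrix. Assumes rgb_feats[i] and ir_feats[i] share the same identity."""
    # L2-normalize rows so the dot product equals cosine similarity
    rgb = rgb_feats / np.linalg.norm(rgb_feats, axis=1, keepdims=True)
    ir = ir_feats / np.linalg.norm(ir_feats, axis=1, keepdims=True)
    sim = rgb @ ir.T                      # (N, N) cross-modality similarity
    target = np.eye(sim.shape[0])         # identity-matrix target
    return np.mean((sim - target) ** 2)   # mean squared deviation
```

When the two modalities yield identical, mutually orthogonal features per identity, the similarity matrix is exactly the identity and the penalty is zero; any modality-specific drift in the features raises it.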