Cross-modality visible-infrared person re-identification (VI-ReID), which aims to retrieve pedestrian images captured by both visible and infrared cameras, is a challenging but essential task for smart surveillance systems. The large gap between the visible and infrared modalities leads to substantial cross-modality discrepancy and intraclass variations. Most existing VI-ReID methods learn discriminative modality-sharable features based on either global or part-based representations, but they lack effective optimization objectives. In this article, we propose a novel global-local multichannel (GLMC) network for VI-ReID, which learns multigranularity representations from both global and local features. The coarse- and fine-grained information complement each other to form a more discriminative feature descriptor. In addition, we propose a novel center loss function that simultaneously improves intraclass cross-modality similarity and enlarges interclass discrepancy, explicitly handling the cross-modality discrepancy issue while avoiding model fluctuation during training. Experimental results on two public datasets demonstrate that the proposed method outperforms state-of-the-art approaches in terms of effectiveness.
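As a rough sketch of the center loss idea described above (the exact formulation is given in the full paper; all symbols below are illustrative assumptions), such a loss could pull the per-identity visible and infrared feature centers together while enforcing a margin between the centers of different identities:

$$
\mathcal{L}_{\mathrm{center}}
= \sum_{i=1}^{C} \bigl\| \mathbf{c}_i^{\mathrm{vis}} - \mathbf{c}_i^{\mathrm{ir}} \bigr\|_2^2
\;+\; \sum_{i=1}^{C} \sum_{j \neq i} \bigl[\, \rho - \bigl\| \bar{\mathbf{c}}_i - \bar{\mathbf{c}}_j \bigr\|_2 \,\bigr]_{+}
$$

where $C$ is the number of identities, $\mathbf{c}_i^{\mathrm{vis}}$ and $\mathbf{c}_i^{\mathrm{ir}}$ denote the visible and infrared feature centers of identity $i$, $\bar{\mathbf{c}}_i$ is their mean, $\rho$ is a margin hyperparameter, and $[\cdot]_+ = \max(0,\cdot)$. The first term targets intraclass cross-modality similarity; the second enlarges interclass discrepancy, matching the two objectives the abstract attributes to the proposed loss.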