Abstract
Person re-identification aims to retrieve specific pedestrians across different cameras and scenes, a task for which extracting robust and discriminative features is crucial. To explore the potential interactions among images and learn more robust representations, this paper proposes a Transformer-based Feature Interactor (TFI) and an improved Margin Self-punishment Softmax loss (MS-Softmax). The TFI consists of Group Channel Pyramid Attention (GCPA) and Neighbor Interaction Modeling (NIM). First, the GCPA module uses low-level semantics to provide prior information for high-level semantics; the attention information is stacked gradually from coarse to fine to obtain enhanced hierarchical multi-scale features. Then, NIM models the input together with its similar neighbors to produce a more robust and discriminative image representation. To make TFI focus more on intra-class embedding learning, we also propose MS-Softmax to guide deep network learning, which obtains a tighter custom classification boundary by pushing the inter-class threshold and minimizing the intra-class variance. The proposed method is evaluated on four datasets, achieving 92.8%/95.6% mAP/Rank-1 on Market1501, 86.1%/90.8% mAP/Rank-1 on DukeMTMC, 64.4%/81.2% mAP/Rank-1 on MSMT17, 79.7%/80.8% mAP/Rank-1 on CUHK03-detected, and 81.8%/81.9% mAP/Rank-1 on CUHK03-labeled. Extensive experiments demonstrate that the proposed method is competitive with other state-of-the-art methods.