Abstract

Person re-identification is an important application in video surveillance analysis, yet recognizing a person of interest across disjoint cameras with different viewpoints remains challenging. Factors that affect identification results include background variation, differing illumination conditions, and changes in human body pose. Existing person re-identification methods mainly focus on extracting features from the whole frame and on metric learning functions. However, most of these algorithms treat all image areas without distinction. Different local regions contribute differently to the image representation, which is exactly what an attention mechanism captures. In this paper, we introduce a novel attention network that explores spatial attention in a convolutional neural network. Our algorithm learns visual attention over multi-layer feature maps. The proposed model not only attends to the spatial probabilities of local regions, but also takes features at different levels into account. We evaluate this multi-layer spatial attention model on three benchmark person re-identification datasets: Market-1501, CUHK03, and DukeMTMC-reID. The experimental results validate the advantages of the proposed network in comparison with state-of-the-art baselines.

Highlights

  • The person re-identification (Re-ID) task, an indispensable part of video behaviour analysis, has received widespread attention

  • Our contributions are: (1) We propose a spatial attention-based convolutional neural network for the person re-identification task

  • The algorithms consist of feature extraction methods, i.e. coRrelation Aware Feature augmenTation (CRAFT) [28], GLAD [41], and Zhao et al. [42]; metric learning methods, i.e. SCSP [43] and DNS [26]; and deep network-based methods, i.e. Comparative Attention Network (CAN) [37], PIE+Kissme [44], and PDC [45]


Summary

Introduction

The person re-identification (Re-ID) task, an indispensable part of video behaviour analysis, has received widespread attention. It has broad potential applications, especially in security systems. It remains challenging because the appearance of the same person changes considerably under different shooting conditions. In person re-identification, an image captured by Camera A (the probe image) is compared with all images from Camera B (the gallery images), and the results are ranked by their degree of similarity to the probe image. To achieve good performance, two steps are essential: (i) extracting features that better describe the images; (ii) finding a proper similarity measure. Regarding similarity learning, several metric learning methods [5, 2, 6] have been proposed to learn a feature space in which the distances between feature vectors belonging to the same person are smaller than those between feature vectors belonging to different pedestrians.
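The probe-versus-gallery ranking described above can be illustrated with a minimal sketch. It assumes each image has already been reduced to a fixed-length feature vector and uses plain Euclidean distance as the similarity measure; the function names, image ids, and toy vectors are hypothetical and are not taken from the paper, which learns its features with an attention network.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_gallery(probe, gallery):
    """Return gallery ids ordered from most to least similar to the probe.

    `gallery` maps a (hypothetical) image id to its feature vector.
    A smaller distance is treated as a higher similarity.
    """
    return sorted(gallery, key=lambda gid: euclidean(probe, gallery[gid]))


# Toy feature vectors (assumed); a real system would use a feature extractor.
probe = [0.9, 0.1, 0.3]  # image from Camera A
gallery = {              # images from Camera B
    "cam_b_001": [0.1, 0.8, 0.5],
    "cam_b_002": [0.8, 0.2, 0.3],
    "cam_b_003": [0.5, 0.5, 0.5],
}

ranking = rank_gallery(probe, gallery)
print(ranking)
```

In this sketch the first entry of `ranking` is the best match. A metric learning method would replace `euclidean` with a learned distance, so that vectors of the same person end up closer together than vectors of different people.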

