Abstract

Person re-identification (ReID) has drawn increasing research attention because of its potential for analyzing and processing massive amounts of surveillance data. However, it remains very challenging to learn discriminative information when the people in the images are occluded, show large pose variations, or are captured from different perspectives. To address this problem, we propose a novel joint attention person ReID (JA-ReID) architecture. The idea is to learn two complementary feature representations by combining a soft pixel-level attention mechanism and a hard region-level attention mechanism. The soft pixel-level attention mechanism learns a discriminative embedding for fine-grained information by exploring the salient parts of the feature maps. The hard region-level attention mechanism uniformly partitions the convolutional feature maps to learn local features. We achieve competitive results on three popular benchmarks: Market1501, DukeMTMC-reID, and CUHK03. The experimental results verify the adaptability of the joint attention mechanism to non-rigid deformation of the human body, which can effectively improve the accuracy of ReID.
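
The hard region-level attention described in the abstract can be illustrated with a minimal sketch. The abstract only states that the convolutional feature maps are partitioned uniformly, so the horizontal stripe direction, the default of six parts, the average pooling, and the function name `hard_region_features` are all assumptions made for illustration and may differ from the paper's implementation.

```python
import numpy as np


def hard_region_features(features, num_parts=6):
    """Split a (C, H, W) feature map into equal horizontal stripes and average-pool each.

    Assumptions (not from the paper): stripes are horizontal, num_parts defaults
    to 6, and each stripe is summarized by average pooling.
    """
    c, h, w = features.shape
    if h % num_parts != 0:
        raise ValueError("H must be divisible by num_parts for a uniform partition")
    # Group the H rows into num_parts consecutive stripes of equal height.
    stripes = features.reshape(c, num_parts, h // num_parts, w)
    # Average over each stripe's rows and columns: one C-dimensional vector per stripe.
    return stripes.mean(axis=(2, 3)).T  # shape (num_parts, C)
```

Each row of the returned array is one local descriptor, so a (C, H, W) feature map yields `num_parts` part-level feature vectors of length C.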

Highlights

  • Person re-identification (ReID) aims to tell whether a person can be found in other non-overlapping surveillance camera views by matching person images [1]

  • The experimental results verify the adaptability of the proposed joint attention person ReID (JA-ReID) architecture to non-rigid deformation of the human body, which can effectively improve the accuracy of ReID

Summary

INTRODUCTION

Person re-identification (ReID) aims to tell whether a person can be found in other non-overlapping surveillance camera views by matching person images [1]. The soft pixel-level attention mechanism can automatically localize the most activated part of the feature maps by aggregating all pixels across channels into one feature map and taking the largest connected component. The advantage of this method is that it can remove background noise and less distinctive parts of the image without any learned parameters, which benefits the ReID problem. The main contributions of this paper are as follows: (1) We propose a soft pixel-level attention mechanism, which can capture fine-grained information in the image. (2) A novel joint attention person re-identification architecture (JA-ReID) is proposed by combining the soft pixel-level attention and hard region-level attention, which can maximize the correlated complementary information. The experimental results verify the adaptability of the joint attention mechanism to non-rigid deformation of the human body, which can effectively improve the accuracy of ReID. Section V gives a brief summary and discussion of our work.
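
The parameter-free localization step described above (aggregate across channels, then keep the largest connected component) can be sketched as follows. This is an illustrative sketch, not the authors' code: the summary does not say how the aggregated map is binarized, so thresholding at its mean value is an assumption, as are the 4-connectivity and the function names `largest_connected_component` and `soft_pixel_attention`.

```python
from collections import deque

import numpy as np


def largest_connected_component(mask):
    """Return a boolean mask keeping only the largest 4-connected region of `mask`."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    best = []
    for i in range(h):
        for j in range(w):
            if not mask[i, j] or seen[i, j]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[i, j] = True
            queue = deque([(i, j)])
            component = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            if len(component) > len(best):
                best = component
    out = np.zeros((h, w), dtype=bool)
    for y, x in best:
        out[y, x] = True
    return out


def soft_pixel_attention(features):
    """features: array of shape (C, H, W). Returns (mask, masked_features)."""
    activation = features.sum(axis=0)       # aggregate all channels into one (H, W) map
    mask = activation > activation.mean()   # assumed threshold: the mean activation
    mask = largest_connected_component(mask)
    return mask, features * mask            # broadcast the (H, W) mask over all channels


# Usage: a 2x2 salient blob plus one isolated background response.
features = np.zeros((2, 4, 4))
features[:, 0:2, 0:2] = 1.0
features[:, 3, 3] = 1.0
mask, masked = soft_pixel_attention(features)
```

In this toy input the isolated response at position (3, 3) passes the threshold but is discarded because it does not belong to the largest connected region, which mirrors the background-removal behavior the text attributes to the mechanism. No parameters are learned at any step.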

RELATED WORK
SOFT PIXEL-LEVEL ATTENTION LEARNING
Findings
CONCLUSION