Abstract

Recently, part-based person re-identification methods have attracted considerable attention and substantially improved accuracy. However, due to large variations caused by camera occlusion, pose change, and misalignment, the corresponding part regions of different images of the same person may miss key cues. In this paper, we propose a local-to-global multi-scale attention network (LGMANet), which fully exploits contextual information and spatial attention information. Our proposed model consists of two branches. The first is a local-to-global branch: through pooling operations, an image generates feature maps of different dimensions, from which we learn local-to-global descriptors by partitioning the feature maps at the same scale. The second is a multi-scale attention branch, which captures contextual dependencies across different convolution layers and further improves the discriminative ability of the image features. Experimental results demonstrate that our method achieves state-of-the-art results on three benchmark datasets: Market-1501, DukeMTMC-reID and CUHK03.
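The local-to-global idea of pooling a feature map into part descriptors at several granularities can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the paper's implementation: the function name, the stripe counts, and the use of plain average pooling are all illustrative choices.

```python
import numpy as np

def stripe_descriptors(feature_map, num_stripes):
    """Split a (C, H, W) feature map into horizontal stripes and
    average-pool each stripe into a C-dimensional part descriptor.
    (Illustrative sketch; the paper's exact partitioning may differ.)"""
    c, h, w = feature_map.shape
    assert h % num_stripes == 0, "height must divide evenly into stripes"
    stripe_h = h // num_stripes
    parts = []
    for i in range(num_stripes):
        stripe = feature_map[:, i * stripe_h:(i + 1) * stripe_h, :]
        parts.append(stripe.mean(axis=(1, 2)))  # average pool over H and W
    return np.stack(parts)  # shape: (num_stripes, C)

# Coarse-to-fine descriptors from one feature map:
# more stripes -> more local parts; one stripe -> a global descriptor.
fmap = np.random.rand(256, 24, 8)          # hypothetical backbone output
local_parts = stripe_descriptors(fmap, 6)  # 6 local descriptors
global_desc = stripe_descriptors(fmap, 1)  # 1 global descriptor
```

Concatenating descriptors from several stripe counts yields a representation that spans local parts through the whole body, which is the intuition behind the local-to-global branch.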
