Abstract

Person re-identification is the task of matching pedestrian images across a network of non-overlapping camera views. It poses compounded challenges resulting from arbitrary human pose, background clutter, illumination variations, and other factors. A large number of studies in recent years have reported promising results. However, key challenges have not been adequately addressed and continue to result in sub-optimal performance. Attention-based person re-identification is gaining popularity as a way to identify discriminative features in person images, but its potential for extracting features common to a pair of person images across the feature extraction pipeline has not been fully exploited. In this paper, we propose a novel attention-based Siamese network driven by a mutual-attention module decomposed into spatial and channel components. The proposed mutual-attention module not only directs feature extraction to the discriminative parts of individual images, but also fuses mutual features symmetrically across pairs of person images to find informative regions common to both inputs. Our model simultaneously learns a feature embedding for discriminative cues and the similarity measure. The model is optimized with a multi-task loss that combines classification and verification losses. The mutual-attention module is itself learnable, which facilitates efficient and adaptive learning. The model is thoroughly evaluated on two widely used large-scale datasets, Market-1501 and Duke-MTMC-ReID. Our experimental results are competitive with state-of-the-art methods and demonstrate the effectiveness of the mutual-attention module.
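The multi-task objective mentioned in the abstract can be illustrated with a minimal sketch. It assumes a softmax cross-entropy identity loss on each image of the pair and a binary cross-entropy verification loss on the predicted same-person probability. The function names, the `verif_weight` parameter, and the exact way the terms are combined are assumptions for illustration only and are not taken from the paper.

```python
import math


def softmax_cross_entropy(logits, target):
    """Cross-entropy of a softmax over `logits` against the class index `target`."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def binary_cross_entropy(prob_same, is_same):
    """Verification loss on the predicted probability that both images show one person."""
    eps = 1e-12
    p = min(max(prob_same, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p) if is_same else -math.log(1.0 - p)


def multi_task_loss(logits_a, id_a, logits_b, id_b, prob_same, verif_weight=1.0):
    """Identity loss on both branches plus a weighted verification loss.

    `verif_weight` is a hypothetical balancing factor, not a value from the paper.
    """
    classification = (softmax_cross_entropy(logits_a, id_a)
                      + softmax_cross_entropy(logits_b, id_b))
    verification = binary_cross_entropy(prob_same, id_a == id_b)
    return classification + verif_weight * verification
```

In this sketch the verification label is derived from the two identity labels, so a pair of images needs no extra annotation beyond the person IDs.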

Highlights

  • The person re-identification task aims to establish correspondence between pedestrian images captured at different times across non-overlapping camera views

  • We present a person re-identification model based on a deep mutual-attention layer and framed as joint identification and verification tasks

  • The proposed model comprises a mutual-attention layer that bridges the two branches of the feature extraction layer, relating spatially active regions across the inputs and boosting them to favor an effective similarity judgment in the subsequent layers
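As a rough illustration of a mutual-attention layer bridging two branches, the sketch below gates a pair of feature maps with a shared channel gate and a cross-image spatial gate. It is a toy, parameter-free stand-in: the paper's module is learnable and its exact formulation is not given in this excerpt, so the gating functions, the use of a sigmoid, and all names here are assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(feat_a, feat_b):
    """One gate per channel, shared by both images.

    feat_a, feat_b: lists of C channels, each a list of N spatial activations.
    The gate is high when a channel is active on average in both images.
    """
    gates = []
    for ch_a, ch_b in zip(feat_a, feat_b):
        mean_a = sum(ch_a) / len(ch_a)
        mean_b = sum(ch_b) / len(ch_b)
        gates.append(sigmoid(mean_a * mean_b))
    return gates


def spatial_gate(feat_q, feat_k):
    """For each spatial position of feat_q, score its best-matching position in feat_k."""
    n_q, n_k = len(feat_q[0]), len(feat_k[0])
    scores = []
    for i in range(n_q):
        best = max(
            sum(ch_q[i] * ch_k[j] for ch_q, ch_k in zip(feat_q, feat_k))
            for j in range(n_k)
        )
        scores.append(sigmoid(best))
    return scores


def mutual_attention(feat_a, feat_b):
    """Re-weight both feature maps by the channel and spatial gates."""
    ch = channel_gate(feat_a, feat_b)
    sp_a = spatial_gate(feat_a, feat_b)
    sp_b = spatial_gate(feat_b, feat_a)
    out_a = [[v * ch[c] * sp_a[i] for i, v in enumerate(row)]
             for c, row in enumerate(feat_a)]
    out_b = [[v * ch[c] * sp_b[i] for i, v in enumerate(row)]
             for c, row in enumerate(feat_b)]
    return out_a, out_b
```

Swapping the two inputs swaps the two outputs, which mirrors the symmetric fusion described in the abstract: neither image is treated as the reference.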


Summary

Introduction

The person re-identification task aims to establish correspondence between pedestrian images captured at different times across non-overlapping camera views. It is a key task for surveillance systems and applications involving human-computer interaction. Given an input image from one camera, called the query image, the goal is to retrieve, from a set of images captured by a different camera (the gallery set), the image or images most similar to the query. The problem has attracted great interest and remains challenging because a person's appearance changes across camera views through the combined effect of variations in lighting, pose, occlusion, and viewpoint, and in some cases through an abrupt change such as different clothing, carrying a bag, or putting on a cap. Some variations/situations not

Symmetry 2020, 12, 358; doi:10.3390/sym12030358 www.mdpi.com/journal/symmetry
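The query-versus-gallery retrieval described in the introduction can be sketched as ranking gallery embeddings by similarity to the query embedding. This sketch uses cosine similarity as a stand-in, whereas the paper learns its similarity measure jointly with the embedding. The function names and the dictionary layout of the gallery are assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_gallery(query, gallery):
    """Return gallery image IDs sorted from most to least similar to the query.

    gallery: dict mapping a (hypothetical) image ID to its embedding vector.
    """
    return sorted(
        gallery,
        key=lambda image_id: cosine_similarity(query, gallery[image_id]),
        reverse=True,
    )
```

A re-identification system would report the top-ranked gallery entries as candidate matches for the query.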

