Abstract

Video person re-identification is a crucial component of a robust surveillance system. Within a video clip, different human regions exhibit different stability characteristics, which can hinder the generation of a discriminative representation. Prior work does not effectively handle these region-specific stability characteristics. To tackle this problem, we propose a Multiple Region Representation Network (MRRNet) that aims to discover discriminative information from different human regions. First, a Stable Region Representation (SRR) layer is proposed to capture important clues from the stable regions and to exchange temporal information through a cross-relation-aware operation. Second, a Multiple Region Representation (MRR) layer is designed to address the unstable regions while preserving attention on the stable regions. Third, SRR and MRR can be conveniently inserted into multiple stages of deep residual networks, where they significantly improve the performance of the network. Comprehensive experiments validate the effectiveness of our network. In particular, MRRNet achieves 86.7% mAP and 91.1% Rank-1 accuracy on the MARS dataset, outperforming state-of-the-art methods.
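For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how Rank-1 accuracy and mAP are commonly computed from a query-by-gallery distance matrix. It is an illustration under simplifying assumptions, not the evaluation code used for MRRNet: the function name `rank1_and_map` and its arguments are hypothetical, and benchmark protocols typically add further filtering (for example, of same-camera gallery samples), which is omitted here.

```python
from typing import List, Sequence, Tuple


def rank1_and_map(
    distances: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
) -> Tuple[float, float]:
    """Return (Rank-1 accuracy, mAP) for a query-by-gallery distance matrix.

    Hypothetical helper for illustration; smaller distance means more similar.
    """
    rank1_hits = 0
    average_precisions: List[float] = []
    for q, row in enumerate(distances):
        # Sort gallery indices from the closest to the farthest candidate.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue  # Queries with no true match in the gallery are skipped.
        # Rank-1: is the single closest gallery item a correct identity?
        rank1_hits += int(matches[0])
        # Average precision: mean of the precision at each correct match.
        hits = 0
        precisions: List[float] = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    valid = len(average_precisions)
    if valid == 0:
        raise ValueError("no query has a matching gallery identity")
    return rank1_hits / valid, sum(average_precisions) / valid


if __name__ == "__main__":
    # Toy data: two queries, three gallery items (identities 1, 2, 1).
    toy_distances = [[0.1, 0.5, 0.9], [0.8, 0.5, 0.3]]
    rank1, mean_ap = rank1_and_map(toy_distances, [1, 2], [1, 2, 1])
    print(f"Rank-1: {rank1:.3f}, mAP: {mean_ap:.3f}")
```

Rank-1 only checks the top-ranked gallery item, whereas mAP rewards ranking every correct match early, which is why the two numbers can differ noticeably for the same model.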
