Abstract

Supervised person re-identification (Re-Id) requires massive amounts of labeled pedestrian data, which are very difficult to collect in practice, so unsupervised Re-Id approaches have attracted growing attention. Existing unsupervised person Re-Id models learn global pedestrian features from whole images or from a fixed set of patches. These models ignore the differences between regions of a pedestrian image for feature representation, such as occluded and pose-invariant regions, and this reduces their robustness for cross-view feature learning. To address these issues, we propose an Unsupervised Region Attention Network (URAN) that learns cross-view region attention features from cropped pedestrian images, with the crops determined by region importance weights on the images. URAN uses a Pedestrian Region Biased Enhance (PRBE) loss to produce high attention weights for the most important regions in pedestrian images. Furthermore, URAN employs a first neighbor relation grouping algorithm and a First Neighbor Relation Constraint (FNRC) loss to provide the training direction for the unsupervised region attention network, so that the region attention features are discriminative enough for the unsupervised person Re-Id task. In experiments on two popular datasets, Market1501 and DukeMTMC-reID, we evaluate the PRBE and FNRC losses and their balance parameter to demonstrate the effectiveness and efficiency of the proposed URAN. The results show that URAN outperforms state-of-the-art methods by at least 1.1%.
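The abstract does not spell out the first neighbor relation grouping algorithm, so the following is only a minimal sketch of one plausible reading: each sample is linked to its first (nearest) neighbor in feature space, and the connected components of those links become groups that could serve as pseudo-labels. The function name `first_neighbor_groups`, the use of Euclidean distance, and the union-find bookkeeping are all assumptions made for illustration, not details taken from the paper.

```python
import math


def first_neighbor_groups(features):
    """Assign a group id to each feature vector (hypothetical helper).

    Assumption: two samples share a group if they are connected through
    a chain of first-nearest-neighbor links (Euclidean distance).
    """
    n = len(features)
    parent = list(range(n))  # union-find forest, one node per sample

    def find(i):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            # Keep the smaller index as root so results are deterministic.
            parent[max(ri, rj)] = min(ri, rj)

    for i in range(n):
        # First neighbor of sample i: the closest other sample, if any.
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(features[i], features[j]),
            default=None,
        )
        if nearest is not None:
            union(i, nearest)

    # Relabel roots as consecutive group ids in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Toy 2-D "features": two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(first_neighbor_groups(points))  # [0, 0, 1, 1]
```

In this sketch the number of groups is not fixed in advance; it falls out of the neighbor links. The brute-force neighbor search is quadratic in the number of samples, which is fine for a toy example but would need an approximate nearest-neighbor index at dataset scale.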

Highlights

  • Person re-identification (Re-Id) is widely regarded as a retrieval problem: determining whether a specific person appears in non-overlapping images or video sequences.

  • We conduct a series of competitive experiments on two public person Re-Id datasets for the Unsupervised Region Attention Network (URAN) and analyze the influence of the Pedestrian Region Biased Enhance (PRBE) and First Neighbor Relation Constraint (FNRC) losses. The results show the superiority of our URAN approach over state-of-the-art methods and demonstrate the significance of the PRBE and FNRC losses.

  • Related work: we review and summarize three categories of research related to our URAN approach: supervised person Re-Id, unsupervised person Re-Id, and attention networks.


Summary

Introduction

Person re-identification (Re-Id) is widely regarded as a retrieval problem: determining whether a specific person appears in non-overlapping images or video sequences.

UMDL [22] is a multi-task dictionary learning approach that learns a data-shared but target-data-biased representation. PUL [8] is a progressive unsupervised learning method trained by iterating between pedestrian clustering and fine-tuning of the convolutional neural network. CAMEL [41] learns an asymmetric distance metric for each view and finds a shared space in which view-specific bias is alleviated. PTGAN [36] bridges the domain gap between source and target data with a person transfer generative adversarial network. SPGAN [7] introduces a similarity-preserving cycle-consistent generative adversarial network into an unsupervised domain adaptation approach, generating images for effective target-domain learning. MAR [43] is a deep model for soft multilabel learning in unsupervised person Re-Id; it uses soft multilabel-guided hard negative mining to learn a discriminative embedding for the unlabeled target domain by exploring the similarity consistency between the visual features and the soft multilabels of unlabeled target pairs. PAUL [40] is a patch-based unsupervised learning framework that leverages similarity between patches to learn a discriminative model. ECN [50] introduces an exemplar memory to store features of the target domain and accommodates three invariance properties, which enforce constraints over the global training batch without significantly increasing computational cost.
