Abstract

In the last few years, supervised person re-identification (re-id) has made significant progress in matching pedestrians across disjoint camera views in surveillance. However, when a re-id system is extended to new scenes, supervised training is often infeasible because sufficient labeled images are unavailable. Unsupervised methods are therefore vital for reducing labeling costs. A crucial challenge for unsupervised person re-id is cross-camera scene variation, such as occlusion, which results in uneven pairwise similarity distributions and degrades matching performance. To address this issue, we propose a local manifold consistency learning (LMCL) framework that consists of a context-aware feature embedding network and a camera-aware manifold alignment strategy. To extract more comprehensive features of the persons in images, we propose a saliency feature attention algorithm that crops feature maps into regions and transforms them into context features. To alleviate the effect of cross-camera scene variation, we optimize our model with a sub-domain alignment loss, which reduces the distance between sub-domains composed of similar samples. Extensive experiments and ablation studies verify the effectiveness of our LMCL approach.
