Pose estimation and occlusion augmentation based vision transformer for occluded person re-identification

Y Wei, X Chen, D Niu, Z Xu, Y Dong, H Gong

doi:10.1049/icp.2023.0150

Abstract

Occluded person re-identification (ReID) is challenging because person images captured by real surveillance cameras are often occluded by various obstacles. Extracting partial features from person images is crucial for occluded person ReID. In this paper, we propose the Pose Estimation and Occlusion Augmentation Based Vision Transformer (POVT), which leverages a Pose Estimation Guided Vision Transformer (PEGVT) and an Occlusion Generation Module (OGM) to extract discriminative partial features. PEGVT divides the patch embeddings into different areas according to the pose estimation results and guides key point tokens to interact with the corresponding patch embeddings to extract key point partial features. OGM simultaneously generates realistic occlusion data, which improves the robustness of the ReID model, and occlusion mask information, which supervises the fine-tuning of the pose estimation model to alleviate the performance degradation caused by the domain gap. Experimental results on occluded re-identification datasets validate the effectiveness of the proposed POVT.
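To make concrete the idea of generating occluded data together with an occlusion mask, the following is a minimal sketch and not the OGM described in the paper. It assumes that an occluder is a rectangular image patch pasted onto a person image at a given position. The function name `paste_occluder` and its parameters are hypothetical, and the abstract does not specify how occluders are chosen or placed.

```python
import numpy as np


def paste_occluder(image, occluder, top, left):
    """Paste `occluder` onto a copy of `image`.

    Hypothetical helper for illustration only. Returns the occluded image
    and a binary mask (1 = occluded pixel) with the image's height and width.
    """
    h, w = image.shape[:2]
    oh, ow = occluder.shape[:2]

    # Clip the occluder rectangle to the image bounds.
    y0, x0 = max(top, 0), max(left, 0)
    y1, x1 = min(top + oh, h), min(left + ow, w)

    occluded = image.copy()
    mask = np.zeros((h, w), dtype=np.uint8)

    # Only paste if some part of the occluder overlaps the image.
    if y1 > y0 and x1 > x0:
        occluded[y0:y1, x0:x1] = occluder[y0 - top:y1 - top, x0 - left:x1 - left]
        mask[y0:y1, x0:x1] = 1

    return occluded, mask


# Usage: a black 8x4 "person image" and a white 4x4 occluder placed so
# that it partly hangs over the bottom edge of the image.
image = np.zeros((8, 4, 3), dtype=np.uint8)
occluder = np.full((4, 4, 3), 255, dtype=np.uint8)
occluded, mask = paste_occluder(image, occluder, top=6, left=0)
```

In a sketch like this, the occluded image would serve as augmented training data for the ReID model, while the mask indicates which regions (and therefore which key points) are hidden, which is the kind of signal the abstract says is used to supervise fine-tuning of the pose estimation model.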

Full Text