Point-level feature learning based on vision transformer for occluded person re-identification

Hua Gao, Chenchen Hu, Guang Han, Jiafa Mao, Wei Huang, Qiu Guan

doi:10.1016/j.imavis.2024.104929

Abstract

Person re-identification is challenging because pose variation and occlusion significantly degrade the matching of visual features across camera views. This paper proposes a novel method for occluded person re-identification that introduces point-level feature learning based on vision transformers. Our approach uses a pose estimator to detect the keypoints of the human body and uses these keypoints to locate the corresponding intermediate features. These keypoint features are fed to a pose-based transformer branch to learn point-level features. We then design a part-based transformer branch to learn part-level features that capture the visual features of different image parts, further enhancing the discriminative power of the learned representation. Additionally, a global branch learns a global-level feature by treating the person image as a single entity. Finally, we integrate the point-level, part-level, and global-level features to represent a person. Experimental results on occluded and partial person re-identification datasets demonstrate the effectiveness of the proposed approach, which shows potential for improving person re-identification in scenarios with occlusion and pose variations.
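The data flow described above (keypoints locate intermediate features, which are then combined with part-level and global-level features) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how keypoints are mapped to features or how the three levels are fused, so the sketch assumes a ViT-style row-major grid of patch tokens, replaces each transformer branch with simple pooling or token lookup, splits parts into horizontal stripes, zeroes out low-confidence (likely occluded) keypoints, and fuses by concatenation. All function names, the `min_conf` threshold, and the stripe layout are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Keypoint = Tuple[float, float, float]  # (x, y, confidence) in pixel coordinates


def keypoint_to_token_index(x: float, y: float, img_w: int, img_h: int, patch: int) -> int:
    """Map a pixel coordinate to the row-major index of the patch that contains it."""
    cols, rows = img_w // patch, img_h // patch
    # Clamp so that keypoints predicted slightly outside the image stay on the grid.
    col = min(max(int(x // patch), 0), cols - 1)
    row = min(max(int(y // patch), 0), rows - 1)
    return row * cols + col


def mean_pool(tokens: Sequence[Vector]) -> Vector:
    """Average a non-empty sequence of equally sized token vectors."""
    dim = len(tokens[0])
    return [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]


def build_descriptor(
    tokens: Sequence[Vector],
    keypoints: Sequence[Keypoint],
    img_w: int,
    img_h: int,
    patch: int,
    num_parts: int,
    min_conf: float = 0.5,  # hypothetical threshold, not taken from the paper
) -> Vector:
    """Concatenate global-, part-, and point-level features (assumed fusion)."""
    cols, rows = img_w // patch, img_h // patch
    dim = len(tokens[0])

    # Global level: the whole image treated as a single entity.
    global_feat = mean_pool(tokens)

    # Part level: horizontal stripes of the token grid (an assumed part layout).
    rows_per_part = rows // num_parts
    part_feats: List[Vector] = []
    for p in range(num_parts):
        start = p * rows_per_part * cols
        # The last stripe absorbs any leftover rows.
        end = (p + 1) * rows_per_part * cols if p < num_parts - 1 else rows * cols
        part_feats.append(mean_pool(tokens[start:end]))

    # Point level: look up the token under each confident keypoint.
    point_feats: List[Vector] = []
    for x, y, conf in keypoints:
        if conf >= min_conf:
            idx = keypoint_to_token_index(x, y, img_w, img_h, patch)
            point_feats.append(list(tokens[idx]))
        else:
            # Assumption: suppress keypoints that are likely occluded.
            point_feats.append([0.0] * dim)

    descriptor = list(global_feat)
    for feat in part_feats + point_feats:
        descriptor.extend(feat)
    return descriptor
```

In a real model, the token lookup would feed a learned pose-based branch rather than being used directly, and bilinear interpolation over neighbouring tokens may be preferable to nearest-patch lookup; both choices are left open by the abstract.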

Full Text