Abstract

Occlusion scenarios pose a great challenge to the person re-identification (ReID) task because occlusions can weaken discriminative features and introduce interference. Recently, Transformer-based networks, which adaptively aggregate the features of all image patches to construct global features, have shown advantages in occluded person ReID. Existing methods mainly adopt the Transformer as a feature extractor and enhance local features taken from the output of the Transformer encoder. However, as features pass through the self-attention blocks, disturbing features from occlusions may be diffused into all the tokens, making it difficult to construct effective local features. We therefore predict the occlusion situation of an image before feature extraction and guide the Transformer encoder to focus on visible regions, suppressing interference from occlusion. Furthermore, we propose to imagine the parts of the target hidden by occlusion and reconstruct pseudo-holistic features for more robust retrieval. To this end, we propose the Occlusion Suppression and Repairing Transformer (OSRTrans). First, a self-supervised occlusion predictor predicts occlusion scores for the image patches. Then the Occlusion Suppression Encoder (OSE), guided by the occlusion predictions, suppresses interference from occluded regions and constructs a global feature. Finally, inspired by contrastive learning, the Feature Repairing Head (FRH) reconstructs pseudo-holistic features. Our method enhances the model’s ability to extract discriminative local features and achieves state-of-the-art performance on occluded person ReID benchmarks, e.g., a Rank-1 accuracy of 72.9% on Occluded-DukeMTMC.
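As a rough illustration of how per-patch occlusion scores could steer attention toward visible regions, the sketch below biases the attention logits of one query token by the log of each patch's predicted visibility. This is only one plausible reading, not the OSE described in the paper: the function names `suppressed_attention` and `aggregate`, the `eps` floor, and the log-visibility bias itself are all assumptions made for illustration.

```python
import math


def suppressed_attention(logits, occlusion_scores, eps=1e-6):
    """Softmax over patch logits that down-weights patches predicted as occluded.

    logits: raw attention logits of one query token over N patches.
    occlusion_scores: predicted occlusion probability per patch, in [0, 1].
    eps: hypothetical floor on visibility so log() stays finite.
    """
    if len(logits) != len(occlusion_scores):
        raise ValueError("one occlusion score is needed per patch")
    # Adding log(visibility) to a logit multiplies its un-normalized
    # attention weight by that visibility (assumed form of the guidance).
    biased = [
        logit + math.log(max(1.0 - score, eps))
        for logit, score in zip(logits, occlusion_scores)
    ]
    peak = max(biased)  # subtract the max for numerical stability
    exps = [math.exp(b - peak) for b in biased]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(features, weights):
    """Weighted sum of patch feature vectors, giving one global feature."""
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
```

With equal logits and the third patch scored as fully occluded, nearly all attention mass falls on the two visible patches, so the aggregated feature is close to their average and is barely affected by the occluded patch.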

