Person search is a challenging task that aims to identify individuals in uncropped real-world scene images by jointly performing person detection and person re-identification. Previous studies primarily focus on learning rich features to improve identification. However, indiscriminate feature enhancement may introduce unwanted background noise. Moreover, pedestrian appearance varies across scenarios and pedestrians are often occluded, which leads to inconsistent or incomplete pedestrian features across images. In this paper, we introduce a novel Attentive Multi-granularity Perception (AMP) module, integrated into our network, AMPN. The module specifically addresses appearance variations and occlusions within a person's Region of Interest (RoI). It exploits discriminative relation features from various local regions, significantly improving identification accuracy. It comprises two principal components: the Pedestrian Perception Enhancement (PPE) block and the Background Interference Suppressor (BIS). The PPE block introduces a Spatial-wise Feature Mixer and a Channel-wise Feature Mixer, which capture and refine discriminative relation features. The BIS operates in parallel with the PPE block, enriching the discriminative relation features and sharpening the distinction between foreground and background. The AMP module is plug-and-play and can be integrated into other person search models. Extensive experiments validate the merits of our model, which achieves state-of-the-art performance on CUHK-SYSU and a 4.8% mAP gain over SeqNet on PRW at a practical speed. Our code is available at https://github.com/zqx951102/AMPN.
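To make the structure described above more concrete, the following is a minimal NumPy sketch of a spatial-wise mixer and a channel-wise mixer applied to RoI features, with a parallel branch that down-weights positions. It is not the authors' implementation: the abstract does not specify the internals of the PPE block or the BIS, so the mixers here are assumed to follow an MLP-Mixer-style design, the BIS is replaced by a hypothetical per-position sigmoid gate, and the two branches are simply summed. All function names, shapes, and parameters are illustrative assumptions.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    # Normalize each row over the last axis.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def mlp(x, w1, w2):
    # Two-layer perceptron with ReLU, applied over the last axis.
    return np.maximum(x @ w1, 0.0) @ w2


def spatial_mixer(tokens, w1, w2):
    # tokens: (N, C). Transpose so the MLP mixes across the N positions,
    # then add the result back as a residual.
    return tokens + mlp(layer_norm(tokens).T, w1, w2).T


def channel_mixer(tokens, w1, w2):
    # Mix across the C channels of each position, with a residual.
    return tokens + mlp(layer_norm(tokens), w1, w2)


def background_gate(tokens, w):
    # Hypothetical stand-in for the BIS (an assumption, not the paper's
    # design): a per-position sigmoid score in (0, 1) scales each token.
    scores = 1.0 / (1.0 + np.exp(-(tokens @ w)))  # shape (N, 1)
    return tokens * scores


def amp_sketch(roi, params):
    # roi: (H, W, C) feature map of one RoI.
    h, w, c = roi.shape
    tokens = roi.reshape(h * w, c)
    # Assumed PPE-like branch: spatial mixing followed by channel mixing.
    ppe = channel_mixer(
        spatial_mixer(tokens, *params["spatial"]), *params["channel"]
    )
    # Parallel gated branch; summing the branches is also an assumption.
    bis = background_gate(tokens, params["gate"])
    return (ppe + bis).reshape(h, w, c)


def init_params(n, c, hidden, seed=0):
    # Small random weights, for illustration only.
    rng = np.random.default_rng(seed)
    s = 0.02
    return {
        "spatial": (rng.normal(0, s, (n, hidden)), rng.normal(0, s, (hidden, n))),
        "channel": (rng.normal(0, s, (c, hidden)), rng.normal(0, s, (hidden, c))),
        "gate": rng.normal(0, s, (c, 1)),
    }


if __name__ == "__main__":
    roi = np.random.default_rng(1).normal(size=(7, 7, 16))
    out = amp_sketch(roi, init_params(n=49, c=16, hidden=32))
    print(out.shape)
```

The output keeps the input shape, which is one simple way a module of this kind could be dropped into an existing RoI head without changing the surrounding layers. The released code at the URL above should be consulted for the actual design.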