Abstract

The key to efficient person search is jointly localizing pedestrians and learning discriminative representations for person re-identification (re-ID). Several recent models are built with separate detection and re-ID branches on top of a shared region feature extraction network. In such designs, two factors are detrimental to re-ID feature learning. One is the redundant background information that results from the large receptive fields of neurons. The other is the missing body parts and background clutter caused by inaccurate localization. In this work, a bottom-up fusion (BUF) subnet is proposed to fuse the bounding box features pooled from multiple network stages. With only a few additional parameters, BUF leverages multi-level features with receptive fields of various sizes to mitigate the background-bias problem. To further suppress non-pedestrian regions, a newly introduced segmentation head generates a foreground probability map that guides the network to focus on foreground regions. The resulting foreground attention module (FAM) enhances the foreground features. Moreover, for robust feature learning in practical person search, we propose adaptive label smoothing (ALS), which smooths the labels of the pedestrian boxes according to the detection quality. Extensive experiments on PRW and CUHK-SYSU validate the effectiveness of the proposed components. Our Bottom-Up Foreground-Aware Feature Fusion (BUFF) network with ALS achieves considerable gains over the state of the art on PRW and competitive performance on CUHK-SYSU.
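
The idea of smoothing identity labels according to detection quality can be illustrated with a minimal sketch. The abstract does not give the ALS formula, so everything below is an assumption made for illustration: the detection quality is taken to be a score in [0, 1] (for example, the IoU between a detected box and its ground-truth box), the smoothing strength is assumed to decrease linearly with that score, and the names `adaptive_smooth_label`, `soft_cross_entropy`, and `max_eps` are hypothetical.

```python
import math


def adaptive_smooth_label(num_ids, target, quality, max_eps=0.2):
    """Build a smoothed identity label for one pedestrian box.

    Assumption (not from the paper): the smoothing strength is
    eps = max_eps * (1 - quality), so a well-localized box keeps a
    nearly one-hot label and a poorly localized box gets a softer one.
    """
    if num_ids < 2:
        raise ValueError("need at least two identities")
    if not 0 <= target < num_ids:
        raise ValueError("target identity out of range")
    q = min(max(quality, 0.0), 1.0)      # clamp quality to [0, 1]
    eps = max_eps * (1.0 - q)            # hypothetical linear schedule
    off_value = eps / (num_ids - 1)      # mass spread over the other identities
    label = [off_value] * num_ids
    label[target] = 1.0 - eps
    return label


def soft_cross_entropy(logits, soft_label):
    """Cross-entropy between softmax(logits) and a soft target distribution."""
    m = max(logits)                      # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(p * (x - log_z) for x, p in zip(logits, soft_label))


# Usage: a box with quality 0.5 gets a softer target than a box with quality 1.0.
logits = [2.0, 0.5, -1.0, 0.0]
loss_good_box = soft_cross_entropy(logits, adaptive_smooth_label(4, 0, 1.0))
loss_poor_box = soft_cross_entropy(logits, adaptive_smooth_label(4, 0, 0.5))
```

Under this assumed schedule, a box that matches its ground truth exactly is trained with a one-hot identity label, while a loosely localized box contributes a softer target, which limits how strongly features polluted by background or missing body parts are pushed toward a single identity.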
