Abstract
Person search involves localizing and re-identifying persons of interest captured by multiple, non-overlapping cameras. Recent approaches to person search are typically built on object detection frameworks to learn joint person representations for detection and re-identification. To this end, the features extracted from pedestrian proposals are projected onto a unit hypersphere using L2 normalization, and positive proposals that sufficiently overlap with the ground truth are incorporated equally into training by exploiting an external lookup table (LUT). We have found that (1) applying L2 normalization without considering feature distributions can degrade the discriminative power of person representations, (2) positive proposals often depict distracting details, such as background clutter and person overlaps, and (3) person features in the LUT are updated infrequently during training. To address these limitations, we propose a novel framework for person search, dubbed PLoPS, using a prototypical normalization layer, ProtoNorm, that calibrates features while considering the long-tail distribution across person IDs. PLoPS also includes a localization-aware learning scheme that prioritizes proposals that are better aligned with the ground truth. We further introduce a LUT calibration technique to continuously adjust the person features in the LUT. Experimental results and analysis on standard benchmarks demonstrate the effectiveness of PLoPS.
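The baseline setup that the abstract critiques (L2-normalized proposal features compared against a lookup table) can be illustrated with a minimal NumPy sketch. It is not the PLoPS or ProtoNorm implementation. The function names, feature dimension, and toy data are all hypothetical, and the use of cosine similarity for matching is an assumption about how the LUT is queried.

```python
import numpy as np

def l2_normalize(features, eps=1e-12):
    """Project each row of `features` onto the unit hypersphere."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    # Guard against division by zero for all-zero feature vectors.
    return features / np.maximum(norms, eps)

def lut_similarities(proposal_features, lut):
    """Cosine similarity between each proposal and each LUT entry (assumed matching rule)."""
    return l2_normalize(proposal_features) @ l2_normalize(lut).T

# Toy data: 5 hypothetical person IDs with 8-dimensional features.
rng = np.random.default_rng(0)
lut = l2_normalize(rng.normal(size=(5, 8)))
# Three unnormalized proposal features with an arbitrary scale.
proposals = rng.normal(size=(3, 8)) * 10.0

sims = lut_similarities(proposals, lut)
unit_proposals = l2_normalize(proposals)
```

Each row of `sims` holds one proposal's similarity to every stored identity. Because plain L2 normalization rescales each feature vector independently, it ignores how features are distributed across identities, which is the first limitation the abstract raises.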