Person search via class activation map transferring

Ruilong Li, Shuangwei Liu, Yunzhou Zhang, Shangdong Zhu

doi:10.1007/s11042-021-10863-7

Abstract

Methods for person search fall into two categories. The first trains an end-to-end model that searches for a target person directly in scene images. The second trains a detection model and a re-identification (re-ID) model separately and cascades them: the detector locates and crops persons in scene images, and the re-ID model finds the target person among the cropped images. Training the two models separately avoids the conflict between different losses that arises in multi-task learning. However, cascaded solutions usually take more time and have more parameters than end-to-end solutions. To keep the advantages of cascaded methods while avoiding their drawbacks, we use knowledge distillation to teach an end-to-end person search model: the Class Activation Map (CAM) of a well-trained person re-ID model serves as an auxiliary supervisory signal, and a well-trained pedestrian detection model is loaded as the pre-trained model. In addition, we adjust the spatial size of the feature map and choose among ResNet models so that the student model achieves either higher performance or faster inference. Experimental results show that our framework outperforms state-of-the-art methods in mAP on the PRW dataset.
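
To make the CAM-based supervision concrete, the following is a minimal sketch, not the authors' implementation. It computes a class activation map as the classifier-weighted sum of feature channels and compares a student map with a teacher map. The function names, the L2 normalization, and the mean squared error are assumptions chosen for illustration; the abstract does not specify the loss. The sketch also assumes the student and teacher maps already share the same spatial size.

```python
from math import sqrt


def class_activation_map(features, class_weights):
    """CAM for one class: M(y, x) = sum_k w_k * f_k(y, x).

    features: list of K channel maps, each a list of H rows of W floats.
    class_weights: the K classifier weights of the chosen class.
    """
    height, width = len(features[0]), len(features[0][0])
    return [
        [sum(w * fmap[y][x] for w, fmap in zip(class_weights, features))
         for x in range(width)]
        for y in range(height)
    ]


def l2_normalize(cam):
    """Scale a map to unit L2 norm so only its spatial pattern matters."""
    norm = sqrt(sum(v * v for row in cam for v in row)) or 1.0
    return [[v / norm for v in row] for row in cam]


def cam_transfer_loss(student_cam, teacher_cam):
    """Assumed auxiliary loss: MSE between the normalized maps."""
    s, t = l2_normalize(student_cam), l2_normalize(teacher_cam)
    diffs = [(a - b) ** 2 for rs, rt in zip(s, t) for a, b in zip(rs, rt)]
    return sum(diffs) / len(diffs)


# Toy example: two 2x2 feature channels and hypothetical class weights.
features = [[[1.0, 0.0], [0.0, 1.0]],
            [[0.0, 2.0], [2.0, 0.0]]]
teacher_cam = class_activation_map(features, [1.0, 0.5])
student_cam = [[1.0, 0.0], [0.0, 0.0]]
loss = cam_transfer_loss(student_cam, teacher_cam)
```

In a training loop, a term of this kind would be added to the student's detection and re-ID losses with a weighting factor, which is likewise an assumption here.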

Full Text