Abstract

Detecting pedestrians is a challenging computer vision task, especially in intelligent transportation systems. Mainstream pedestrian detection methods rely purely on bounding-box information, which overlooks other valuable attributes of pedestrians (e.g., head, head-shoulder, and keypoint annotations) and leads to sub-optimal solutions. Some works have leveraged these attributes, but gained only minor performance improvements at the expense of increased computational complexity during inference. To alleviate this dilemma, we propose a simple yet effective method, namely Detachable crowd Density estimation Assisted pedestrian Detection (DDAD), which leverages crowd density attributes to assist pedestrian detection in real-world scenes (e.g., crowded scenes and scenes with small-scale pedestrians). The advantage of crowd density estimation is that it encourages the network to focus more on human heads and small-scale pedestrians, which improves the feature representation of pedestrians that are heavily occluded or far from the camera. DDAD works on the principle of multi-task learning and can be seamlessly applied to both one-stage and two-stage pedestrian detectors by equipping them with an extra detachable crowd density estimation branch. This branch is trained with annotations derived from the existing pedestrian bounding-box annotations, incurring no extra annotation cost. Moreover, the branch can be removed during inference, so the detector's inference speed is not sacrificed. Extensive experiments on two challenging datasets, i.e., CrowdHuman and CityPersons, demonstrate that our proposed DDAD achieves significant improvements over state-of-the-art methods. Code is available at https://github.com/SCUT-BIP-Lab/DDAD.
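
The sketch below illustrates, in PyTorch, the two ingredients the abstract describes: deriving a density-map training target from existing bounding-box annotations (so no extra labeling is needed) and attaching a detachable density-estimation branch that is trained alongside the detector and skipped at test time. This is not the authors' implementation; the head position (top-center of each box), the Gaussian width, the feature stride, the layer widths, and the MSE loss are all illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of DDAD's two ingredients:
# (i) a density target derived from existing box annotations, and
# (ii) a detachable density branch trained via multi-task learning.
import torch
import torch.nn as nn
import torch.nn.functional as F


def boxes_to_density_map(boxes, hw, sigma=2.0):
    """Render a density map from (x1, y1, x2, y2) pedestrian boxes.

    Each person's head is approximated as the top-center of the box
    (an assumption) and splatted as a normalized Gaussian, so the map
    integrates to the crowd count.
    """
    h, w = hw
    ys = torch.arange(h, dtype=torch.float32).view(-1, 1)
    xs = torch.arange(w, dtype=torch.float32).view(1, -1)
    density = torch.zeros(h, w)
    for x1, y1, x2, y2 in boxes:
        cx, cy = (x1 + x2) / 2.0, y1  # assumed head location: top-center
        g = torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
        density += g / g.sum().clamp(min=1e-6)  # each person sums to 1
    return density


class DensityBranch(nn.Module):
    """Detachable auxiliary head predicting a 1-channel density map."""

    def __init__(self, in_channels=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 128, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 1, 1),
        )

    def forward(self, feat):
        return self.net(feat)


# Training-time usage: the density loss is simply added to the detector's
# own loss (multi-task learning). At inference the branch is skipped, so
# the detector's architecture and speed are untouched.
stride = 4                                         # assumed feature stride
feat = torch.randn(1, 256, 64, 64)                 # shared backbone feature
boxes = [(100.0, 80.0, 140.0, 200.0),              # ground-truth boxes in
         (30.0, 20.0, 50.0, 80.0)]                 # image coordinates
boxes_on_feat = [tuple(c / stride for c in b) for b in boxes]
target = boxes_to_density_map(boxes_on_feat, (64, 64))

branch = DensityBranch(in_channels=256)
pred = branch(feat).squeeze(0).squeeze(0)          # (64, 64) prediction
density_loss = F.mse_loss(pred, target)            # added to detection loss
```

Because the auxiliary branch only reads the shared backbone features and contributes nothing to the box predictions, removing it at inference leaves the detector bit-for-bit identical to its baseline, which is what makes the density supervision "free" at test time.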
