Double Mask R-CNN for Pedestrian Detection in a Crowd

Congqiang Liu, Haosen Wang, Chunjian Liu, Yugen Yi

doi:10.1155/2022/4012252

Abstract

To address the difficulty of feature extraction and the limitations of NMS (non-maximum suppression) in crowded pedestrian detection, this article proposes Double Mask R-CNN, a detection network based on Mask R-CNN with FPN (Feature Pyramid Network). The algorithm makes two improvements. First, a semantic segmentation branch is added to the FPN to strengthen feature extraction for crowded pedestrians. Second, a rule estimates the pedestrian visibility of the detected image from human keypoint information and applies a binary mask to the image where pedestrian visibility is below a certain threshold; the masked image is then fed back into the network to locate occluded pedestrians. Experimental results on the CrowdHuman dataset show that the log-average miss rate (MR) of Double Mask R-CNN is 13.12% lower than the best results of other mainstream networks. Similar improvements are achieved on the WiderPerson dataset.
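The second improvement can be illustrated with a minimal sketch. The abstract does not state the actual visibility rule, so everything below is an assumption made for illustration: visibility is taken to be the fraction of keypoints whose confidence exceeds a score threshold, and the binary mask zeroes out the bounding box of each detection that falls below a visibility threshold. The function names, both thresholds, and the choice of which regions to mask are hypothetical and are not taken from the paper.

```python
import numpy as np


def visibility(keypoint_scores, score_thresh=0.5):
    """Assumed rule: fraction of keypoints with confidence above score_thresh."""
    scores = np.asarray(keypoint_scores, dtype=float)
    if scores.size == 0:
        return 0.0
    return float(np.mean(scores > score_thresh))


def mask_low_visibility(image, boxes, keypoint_scores, vis_thresh=0.6):
    """Zero out boxes whose estimated visibility is below vis_thresh.

    image: H x W x C array
    boxes: iterable of (x1, y1, x2, y2) in pixel coordinates
    keypoint_scores: one list of keypoint confidences per box
    Returns the masked image copy and the binary mask (True = pixel kept).
    """
    keep = np.ones(image.shape[:2], dtype=bool)
    for (x1, y1, x2, y2), scores in zip(boxes, keypoint_scores):
        if visibility(scores) < vis_thresh:
            keep[int(y1):int(y2), int(x1):int(x2)] = False
    masked = image.copy()
    masked[~keep] = 0  # apply the binary mask to every channel
    return masked, keep
```

Under these assumptions, the masked image returned by `mask_low_visibility` would be what is passed to the detector for the second pass; the real network may define visibility and the masked region differently.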
