Abstract

To address the difficulty of feature extraction and the limitations of non-maximum suppression (NMS) in crowded pedestrian detection, this article proposes Double Mask R-CNN, a detection network based on Mask R-CNN with a Feature Pyramid Network (FPN). The algorithm makes two improvements. First, we add a semantic segmentation branch to the FPN to strengthen feature extraction for crowded pedestrians. Second, we design a rule that estimates the pedestrian visibility of a detected image from human keypoint information and applies a binary mask to any image whose pedestrian visibility falls below a given threshold; the masked image is then fed back into the network to locate occluded pedestrians. Experimental results on the CrowdHuman dataset show that the log-average miss rate (MR) of Double Mask R-CNN is 13.12% lower than the best results of other mainstream networks. Double Mask R-CNN achieves similar improvements on the WiderPerson dataset.
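
The visibility rule can be illustrated with a minimal sketch. The abstract does not give the exact rule, so everything below is an assumption made for illustration: visibility is taken to be the fraction of keypoints whose confidence exceeds `score_thresh`, the image is a 2-D list of pixel values, and the names `keypoint_visibility`, `cover_with_mask`, `second_pass_input`, and both threshold values are hypothetical rather than taken from the article.

```python
def keypoint_visibility(scores, score_thresh=0.5):
    """Fraction of keypoints with confidence above score_thresh.

    Assumed definition; the article's actual rule may differ.
    """
    if not scores:
        return 0.0
    return sum(1 for s in scores if s > score_thresh) / len(scores)


def cover_with_mask(image, mask):
    """Return a copy of a 2-D image with pixels set to 0 where mask is truthy."""
    return [
        [0 if m else px for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def second_pass_input(image, mask, scores, vis_thresh=0.6):
    """Mask the image only when estimated visibility is below vis_thresh.

    Returns the (possibly masked) image and the visibility estimate.
    The 0.6 threshold is a placeholder, not a value from the article.
    """
    vis = keypoint_visibility(scores)
    if vis < vis_thresh:
        return cover_with_mask(image, mask), vis
    return image, vis


if __name__ == "__main__":
    image = [[9, 9], [9, 9]]
    mask = [[1, 0], [0, 0]]  # binary mask from a first detection pass
    masked, vis = second_pass_input(image, mask, [0.9, 0.2, 0.1, 0.3])
    print(vis, masked)
```

In this sketch, an image with low estimated visibility has its already-segmented pixels blanked before a second pass, while an image with high visibility is passed through unchanged.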
