Although pedestrian detection has achieved promising performance with the development of deep learning techniques, detecting heavily occluded pedestrians in crowd scenes remains a great challenge. To make an anchor-free network pay more attention to the hard examples of occluded pedestrians, we propose a simple but effective Occlusion-aware Anchor-Free Network (OAF-Net) for pedestrian detection in crowd scenes. Specifically, we first design a novel occlusion-aware detection head, which includes three separate center prediction branches combined with the scale and offset prediction branches. In the detection head of OAF-Net, each pedestrian instance is assigned to the most suitable center prediction branch according to the occlusion level of the human body. To optimize the center prediction, we propose a novel weighted Focal Loss in which pedestrian instances are assigned different weights according to their visibility ratios, so that occluded pedestrians are up-weighted during training. OAF-Net is thus able to model different occlusion levels of pedestrian instances effectively, and its optimization is directed toward the hard training samples of occluded pedestrians. Experiments on the challenging CityPersons, Caltech, and CrowdHuman benchmarks validate the efficacy of OAF-Net for pedestrian detection in crowd scenes.
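The two mechanisms described above, branch assignment by occlusion level and visibility-weighted center loss, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract gives neither the occlusion thresholds nor the weighting function, so the threshold values, the branch names, the linear weight, and all function names below are assumptions made for illustration. The loss shown is the standard per-location focal loss scaled by that assumed weight, which may differ from the variant used in OAF-Net.

```python
import math

# Hypothetical visibility thresholds; the abstract does not state the actual values.
HEAVY_MAX = 0.35    # visibility below this goes to the "heavy" occlusion branch
PARTIAL_MAX = 0.65  # visibility below this goes to the "partial" branch, else "bare"


def assign_branch(visibility):
    """Map a visibility ratio in [0, 1] to one of three center prediction branches."""
    if visibility < HEAVY_MAX:
        return "heavy"
    if visibility < PARTIAL_MAX:
        return "partial"
    return "bare"


def instance_weight(visibility, beta=1.0):
    """Assumed weighting: 1.0 when fully visible, growing linearly as visibility drops."""
    return 1.0 + beta * (1.0 - visibility)


def weighted_focal_loss(p, is_positive, visibility=1.0, alpha=0.25, gamma=2.0):
    """Standard focal loss at one location; positives are scaled by the visibility weight."""
    eps = 1e-7
    p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
    if is_positive:
        return -instance_weight(visibility) * alpha * (1.0 - p) ** gamma * math.log(p)
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
```

With these assumed settings, a pedestrian with visibility ratio 0.2 is routed to the "heavy" branch and, for the same predicted center confidence, contributes a larger loss than one with visibility ratio 0.9, which is the up-weighting effect the abstract describes.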