Abstract

Feature pyramid networks are widely used in one-stage detectors to predict targets at different scales. However, most such detectors treat all feature levels equally, without level-wise attention. As a result, different levels may capture the same object yet predict entirely different labels, which lowers AP at evaluation. In addition, many classic detectors aim to relieve the imbalance between hard and easy samples in the classification task, whereas we find that hard-easy imbalance also exists in localization. In general detection tasks, far fewer bounding boxes are regressed at high IoU than at low IoU, so the detector is easily biased towards suboptimal boxes and localization accuracy suffers. Motivated by these two observations, we propose a Scale-Adaptive Selection Network (SASNet) with a novel Dynamic Focal IoU (DF-IoU) loss. SASNet introduces a multi-scale attention mechanism into the feature pyramid to assign an attention weight to the feature map at each level, which enables the network to select the dominant levels for prediction and alleviates prediction conflicts between levels. Furthermore, we design the DF-IoU loss to increase the loss contribution of easy targets, so that their coordinates are regressed more accurately and the bounding boxes fit more tightly. Experimental results show that SASNet with the DF-IoU loss increases the average precision for small, medium and large objects on the MS COCO and CCPD datasets.
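
To make the localization imbalance concrete, the following is a minimal sketch of how re-weighting an IoU loss can shift loss mass towards easy (high-IoU) boxes. The abstract does not give the DF-IoU formula, so the weighting used here (IoU raised to a power `gamma`, multiplied by the plain `1 - IoU` loss) is purely an assumption for illustration, as are the function names `plain_iou_loss` and `focal_style_iou_loss`. It should not be read as the authors' loss.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def plain_iou_loss(pred, target):
    """Standard IoU loss: 1 - IoU."""
    return 1.0 - iou(pred, target)


def focal_style_iou_loss(pred, target, gamma=2.0):
    """Hypothetical focal-style weighting (an assumption, not the paper's DF-IoU).

    The plain loss is scaled by IoU ** gamma, so boxes that already overlap
    well keep a larger share of the total loss than poorly overlapping ones.
    """
    overlap = iou(pred, target)
    return (overlap ** gamma) * (1.0 - overlap)


def easy_share(loss_fn, easy_pred, hard_pred, target):
    """Fraction of the total loss over both boxes that comes from the easy box."""
    easy = loss_fn(easy_pred, target)
    hard = loss_fn(hard_pred, target)
    return easy / (easy + hard)


target = (0.0, 0.0, 10.0, 10.0)
easy_pred = (0.0, 0.0, 10.0, 9.0)   # IoU 0.9 with the target
hard_pred = (0.0, 0.0, 10.0, 3.0)   # IoU 0.3 with the target

plain_share = easy_share(plain_iou_loss, easy_pred, hard_pred, target)
focal_share = easy_share(focal_style_iou_loss, easy_pred, hard_pred, target)
```

For these two example boxes, the easy box accounts for 12.5% of the total loss under the plain IoU loss and about 56% under the assumed weighting. This is the kind of shift the abstract describes as increasing the loss contribution of easy targets.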

