Abstract

Single-stage detectors rely on a simple regression network to directly predict category scores and regress box offsets for a fixed set of default boxes. This network must generalize well in order to accurately model the relationship between diverse object shapes and the default boxes. Instead of making the regression network more complex to improve its generalization, we iteratively refine the default boxes so that this relationship is modeled sequentially. In this paper, we propose an Attention-Enhanced Progressive Learning Network (APLNet), which employs multiple stages of progressive detection to improve the performance of single-stage detectors. Specifically, a progressive learning module iteratively updates the feature representation space and gradually regresses the default boxes, pushing them progressively closer to the target objects. In addition, since low-level features carry less semantic information about objects, we design an attention enhancement module that generates an attention map used to inject more semantically meaningful information into the low-level features. This module is supervised by box-induced segmentation annotations, i.e., no extra segmentation annotations are required. A multi-task loss function is used to train the whole network end to end. Extensive experiments on the PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO datasets demonstrate the effectiveness of the proposed APLNet.
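
As a rough illustration of two ideas in the abstract, the sketch below shows (a) default boxes being refined over several stages, each stage decoding offsets relative to the previous stage's output, and (b) a segmentation target derived from bounding boxes alone. This is a minimal sketch under stated assumptions, not the APLNet implementation: the function names `decode`, `refine_progressively`, and `boxes_to_mask` are hypothetical, the SSD-style offset parameterization (without variance scaling) is assumed, and the rule of filling each box interior with its class label is one plausible reading of "box-induced" annotations.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]      # center form: (cx, cy, w, h)
Offsets = Tuple[float, float, float, float]  # (dx, dy, dw, dh)


def decode(box: Box, offsets: Offsets) -> Box:
    """Apply one set of regression offsets to a center-form box.

    Assumed SSD-style parameterization: center shifts are scaled by the
    box size, and width/height are scaled exponentially.
    """
    cx, cy, w, h = box
    dx, dy, dw, dh = offsets
    return (cx + dx * w, cy + dy * h, w * math.exp(dw), h * math.exp(dh))


def refine_progressively(default_box: Box, stage_offsets: Sequence[Offsets]) -> Box:
    """Refine a default box over several stages.

    Each stage regresses relative to the box produced by the previous
    stage, so the box can move toward the target object step by step.
    """
    box = default_box
    for offsets in stage_offsets:
        box = decode(box, offsets)
    return box


def boxes_to_mask(
    boxes: Sequence[Tuple[int, int, int, int]],  # corner form: (x1, y1, x2, y2)
    labels: Sequence[int],
    height: int,
    width: int,
) -> List[List[int]]:
    """Build a segmentation target from box annotations only.

    Pixels inside a box receive that box's class label; all other pixels
    stay 0 (background). Later boxes overwrite earlier ones on overlap.
    """
    mask = [[0] * width for _ in range(height)]
    for (x1, y1, x2, y2), label in zip(boxes, labels):
        for y in range(max(0, y1), min(height, y2)):
            for x in range(max(0, x1), min(width, x2)):
                mask[y][x] = label
    return mask


if __name__ == "__main__":
    # Two refinement stages applied to one default box.
    print(refine_progressively((10.0, 10.0, 4.0, 4.0),
                               [(0.5, 0.0, 0.0, 0.0), (0.0, 0.25, 0.0, 0.0)]))
    # A 4x4 mask with one box of class 5.
    for row in boxes_to_mask([(1, 1, 3, 3)], [5], 4, 4):
        print(row)
```

In an actual detector the per-stage offsets would come from the network's regression heads, and the mask would serve as the supervision target for the attention map; both are replaced here by hand-written inputs.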

