Abstract

One-stage object detection approaches that use multi-scale feature maps to predict objects are currently the best real-time detectors. However, in these approaches, the high-resolution feature maps responsible for detecting small objects have a harder time learning a proper abstraction of objects than the low-resolution feature maps do. The problem is that these feature maps must pass sufficient low-level information on to the next layer while also learning high-level abstractions. In this paper, we develop a transformation module that adopts a dense structure to simplify the learning problem for high-resolution feature maps. In addition, we use an inception module to enrich the representational power of high-resolution feature maps. Extensive experiments on object detection benchmarks demonstrate the effectiveness of our method. In particular, on PASCAL VOC 2007/2012, our method outperforms all existing one-stage methods. Our model based on the VGG-16 network also achieves competitive results on MS COCO.
