Abstract

Object classification and localization are the two central tasks of object detectors based on the Single Shot MultiBox Detector (SSD). In general, the more feature maps are used, the better the classification performance. However, when the information in the additional feature maps is sparse or redundant, detection performance improves only slightly or may even degrade, and object localization in particular suffers. The performance of an object detector depends not only on the number of feature maps but also, in part, on bounding box regression and Non-Maximum Suppression (NMS). In this paper, we construct an SSD-based detector called Detection with Refined Feature (DRF), which introduces a center map and a scale map and reshapes the detection loss. Our motivation is to improve the accuracy of classification and localization by searching for central points and predicting the scales of the object points. The center map is used to predict the Intersection over Union (IoU) between the predicted box and the ground-truth box, while the scale map considers the relationships among the different scales. Experimental results on both the Pascal VOC and MS COCO 2014 instance datasets demonstrate the effectiveness of DRF. Using Darknet53, we achieve an 86.4% mean Average Precision (mAP) on Pascal VOC2007 and an 87.4% mAP on Pascal VOC2007 and VOC2012. On MS COCO, DRF with ResNet50 still achieves a moderate improvement.
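
For readers unfamiliar with the two quantities the abstract relies on, the following is a minimal sketch of the standard IoU between two axis-aligned boxes and of plain greedy NMS. It is not the DRF center map or the paper's loss; the `(x_min, y_min, x_max, y_max)` box format, the function names `iou` and `nms`, and the default threshold of 0.5 are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep box i only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    candidates = [(0, 0, 2, 2), (0, 0, 2, 2.1), (5, 5, 6, 6)]
    confidences = [0.9, 0.8, 0.7]
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(nms(candidates, confidences))
```

Because greedy NMS ranks boxes by classification score alone, a well-localized box with a lower score can be suppressed; predicting IoU, as the center map is described as doing, is one way to bring localization quality into that ranking.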
