Object Detection Using Deep Learning: Single Shot Detector with a Refined Feature-fusion Structure

Shili Chen, Jie Hong, Yisheng Guan, Tao Zhang, Jian Li

doi:10.1109/rcar47638.2019.9044027

Abstract

To improve detection accuracy for objects at different scales, most recent studies adopt multilayer architectures. However, the low-level features extracted in the shallow layers carry little semantic information, which can limit detection performance, especially for small objects. In this paper, we propose a refined feature-fusion structure that is integrated with the single shot detector (SSD). To obtain a rich feature representation, the fusion block applies a deconvolution operation to fuse high-level and low-level semantic features. Notably, the proposed framework modifies the feature pyramid network by means of a skip connection so that it describes the features better. An adaptive weighted connection is designed in the feature-fusion block, which further improves detection performance. On the PASCAL VOC2007 test set, the mAP of the proposed network is higher than that of SSD and the deconvolutional single shot detector (DSSD) by 2.03% and 0.63%, respectively, and our method runs 2.2 times as fast as DSSD. Furthermore, on the small-object test set of PASCAL VOC2007, the mAP of our refined feature-fusion SSD is 6.2% higher than that of SSD, which verifies the effectiveness of the proposed model.
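To make the fusion idea concrete, the following is a minimal pure-Python sketch of one possible fusion step, not the network described in the paper. It assumes a single-channel feature map, a 2x2 stride-2 deconvolution (transposed convolution) kernel, and two scalar weights; the names `deconv2x`, `fuse`, `alpha`, and `beta` are hypothetical, and in a trained network the kernel and weights would be learned rather than fixed by hand.

```python
def deconv2x(feature, kernel):
    """Stride-2 transposed convolution of a 2-D map with a 2x2 kernel.

    Each input value is scattered into its own 2x2 output patch, so an
    h x w input produces a 2h x 2w output.
    """
    h, w = len(feature), len(feature[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            for di in range(2):
                for dj in range(2):
                    out[2 * i + di][2 * j + dj] += feature[i][j] * kernel[di][dj]
    return out


def fuse(low, high, kernel, alpha, beta):
    """Weighted sum of a low-level map and an upsampled high-level map.

    alpha and beta are hypothetical scalar weights standing in for an
    adaptive weighted connection (assumption: learned during training).
    """
    up = deconv2x(high, kernel)
    return [
        [alpha * l + beta * u for l, u in zip(low_row, up_row)]
        for low_row, up_row in zip(low, up)
    ]


# Toy inputs: a 2x2 shallow (low-level) map and a 1x1 deep (high-level) map.
low = [[1.0, 1.0], [1.0, 1.0]]
high = [[2.0]]
kernel = [[1.0, 1.0], [1.0, 1.0]]
fused = fuse(low, high, kernel, alpha=0.5, beta=0.25)
```

Here the deep map is upsampled to the resolution of the shallow map before the two are combined, so the fused map keeps the spatial detail of the shallow layer while mixing in the deeper layer's response.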

Full Text