Abstract

To improve detection accuracy for objects at different scales, most recent studies apply multilayer architectures. However, the low-level features extracted in the shallow layers carry little semantic information, which can limit detection performance, especially for small objects. In this paper, we propose a refined feature-fusion structure that is integrated with the single shot detector (SSD). To obtain a rich feature representation, the fusion block applies a deconvolution operation to fuse high-level semantic features with low-level features. In the proposed framework, the feature pyramid network is modified with a skip connection to better describe the features. An adaptive weighted connection is designed in the feature-fusion block, which further improves detection performance. On the PASCAL VOC2007 test set, experimental results show that the mAP of the proposed network is higher than that of SSD and the deconvolutional single shot detector (DSSD) by 2.03% and 0.63%, respectively. Meanwhile, our method is 2.2 times as fast as DSSD. Furthermore, the mAP of our refined feature-fusion SSD is 6.2% higher than that of SSD on the small-object test set of PASCAL VOC2007, which verifies the effectiveness of the proposed model.
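
The fusion idea described above can be illustrated with a toy sketch. It is not the paper's implementation: the abstract does not give the layer configuration or the form of the adaptive weights, so the single-channel feature maps, the stride-2 transposed convolution with a 2x2 kernel, the softmax-normalized scalar weights, and the function names `deconv2x` and `adaptive_fuse` are all assumptions made for illustration.

```python
import math


def deconv2x(feature, kernel):
    """Stride-2 transposed convolution with a 2x2 kernel on one channel.

    Assumed configuration: because stride equals kernel size, each input
    value is spread over its own non-overlapping 2x2 output patch.
    """
    h, w = len(feature), len(feature[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            for a in range(2):
                for b in range(2):
                    out[2 * i + a][2 * j + b] = feature[i][j] * kernel[a][b]
    return out


def adaptive_fuse(low, high_up, logits):
    """Element-wise weighted sum of two same-sized feature maps.

    Assumed form of the adaptive weights: a softmax over two scalar
    logits, which would be learned during training in a real network.
    """
    e_low, e_high = math.exp(logits[0]), math.exp(logits[1])
    w_low = e_low / (e_low + e_high)
    w_high = e_high / (e_low + e_high)
    return [
        [w_low * l + w_high * h for l, h in zip(row_l, row_h)]
        for row_l, row_h in zip(low, high_up)
    ]


# Toy inputs: a 2x2 high-level (deep) map and a 4x4 low-level (shallow) map.
high = [[1.0, 2.0], [3.0, 4.0]]
low = [[1.0] * 4 for _ in range(4)]
kernel = [[1.0, 1.0], [1.0, 1.0]]  # hypothetical fixed kernel, normally learned

high_up = deconv2x(high, kernel)                        # upsample 2x2 -> 4x4
fused = adaptive_fuse(low, high_up, logits=(0.0, 0.0))  # equal weights of 0.5
```

With equal logits both maps contribute half of each fused value. In a trained network the logits would shift toward whichever feature level is more useful for detection at that scale.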
