Abstract

In the traditional single shot multibox detector (SSD), the feature layers of different scales are independent of each other, which results in poor detection performance for small objects. We propose an improved SSD network for small object detection based on dilated convolution and feature fusion, called DFSSD. Specifically, we introduce dilated convolution to enlarge the receptive field of the third-level feature map in the network, which enables the feature map to capture more global information. We also design a feature fusion module that fuses the low-level feature map, which carries detailed information, with the high-level feature map, which carries rich semantic information. In addition, we adjust the prediction box scale of the DFSSD prediction layer. The proposed network achieves 78.9% mAP on the PASCAL VOC2007 test set at 40 FPS and 74.7% AP on difficult objects in the car class of the KITTI dataset. These results outperform the original SSD model by 1.4 and 1.2 points, respectively.
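
As a rough illustration of why dilated convolution enlarges the receptive field without adding parameters, the sketch below computes the effective kernel size and output size of a single dilated convolution. The function names, the 3x3 kernel, the dilation rate of 2, and the 38-pixel input are illustrative assumptions; the abstract does not state the settings DFSSD actually uses.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel along one axis.

    A kernel with k taps and dilation d has (d - 1) gaps between
    neighbouring taps, so it covers d * (k - 1) + 1 input positions.
    """
    return dilation * (kernel_size - 1) + 1


def output_size(input_size: int, kernel_size: int, dilation: int,
                stride: int = 1, padding: int = 0) -> int:
    """Output length along one axis for a dilated convolution."""
    k_eff = effective_kernel_size(kernel_size, dilation)
    return (input_size + 2 * padding - k_eff) // stride + 1


if __name__ == "__main__":
    # Assumed values for illustration only (not taken from the paper).
    k, d, n = 3, 2, 38
    print("standard 3x3 span:", effective_kernel_size(k, 1))
    print("dilated 3x3 span:", effective_kernel_size(k, d))
    # Padding equal to the dilation keeps the map size at stride 1.
    print("output size:", output_size(n, k, d, stride=1, padding=d))
```

With padding equal to the dilation rate, a 3x3 dilated convolution at stride 1 keeps the feature map resolution unchanged while each output position sees a wider input window, which is the property the abstract relies on.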
