An Improved SSD Network for Small Object Detection based on Dilated Convolution and Feature Fusion

Jianlu Fu, Wenxuan Gong, Yunfeng Nie, Yufeng Wu

doi:10.1109/imcec51613.2021.9482158

Abstract

The feature layers of different scales in the traditional single shot multibox detector (SSD) are independent of each other, which results in poor detection performance for small objects. To address this, we propose DFSSD, an improved SSD network for small object detection based on dilated convolution and feature fusion. Specifically, we introduce dilated convolution to enlarge the receptive field of the third-level feature map in the network, which enables the feature map to capture more global information. We also design a feature fusion module that fuses low-level feature maps, which carry detailed information, with high-level feature maps, which carry rich semantic information. Finally, we adjust the prediction box scales in the prediction layer of the DFSSD network. The proposed network achieves 78.9% mAP on the PASCAL VOC2007 test set at 40 FPS and 74.7% AP on difficult objects in the car class of the KITTI dataset. These results outperform the original SSD model by 1.4 and 1.2 points, respectively.
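To illustrate why dilated convolution enlarges the receptive field without adding weights, the following is a minimal one-dimensional sketch in pure Python. It is not the DFSSD implementation: the function names `effective_kernel_size` and `dilated_conv1d`, the toy signal, and the dilation rate of 2 are assumptions chosen for illustration, since the abstract does not state the kernel sizes or dilation rates used.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in input positions) covered by a kernel with the given dilation.

    A kernel with k taps and dilation d covers k + (k - 1) * (d - 1) positions.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding), stride-1 dilated convolution in cross-correlation form.

    Hypothetical helper for illustration only; taps are spaced `dilation` apart.
    """
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    output = []
    for start in range(len(signal) - span):
        value = sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        output.append(value)
    return output


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]  # three weights in both cases

standard = dilated_conv1d(signal, kernel, dilation=1)  # each output sees 3 inputs
dilated = dilated_conv1d(signal, kernel, dilation=2)   # each output spans 5 inputs

print(effective_kernel_size(3, 1), standard)
print(effective_kernel_size(3, 2), dilated)
```

With the same three weights, the dilated kernel spans five input positions instead of three, so each output value draws on a wider context. The same idea applies in two dimensions to a feature map such as the third-level map mentioned above.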

Full Text