Abstract

In object detection, small objects are difficult to detect because they are often tightly clustered and subject to interference from background information. To address this problem, we propose a novel network architecture based on YOLOv3 together with a new feature fusion mechanism. We add multi-scale convolution kernels and differential receptive fields to YOLOv3, using an Inception-like architecture to extract the semantic features of the objects. We also optimize the weights of the feature fusion by selecting appropriate channel number ratios. Our model outperforms YOLOv3 on small, easily clustered objects such as airplanes, birds, and people, while its detection speed remains comparable to that of YOLOv3.
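As a rough illustration of the two ideas named above, the following sketch computes the receptive field of each branch in an Inception-like block and divides a channel budget among the branches according to a chosen ratio. It is only a sketch under stated assumptions: the branch layout (1x1, 3x3, and two stacked 3x3 convolutions), the 1:2:1 ratio, the 256-channel budget, and the function names `receptive_field` and `split_channels` are hypothetical and are not taken from the proposed model.

```python
from typing import List, Sequence, Tuple


def receptive_field(layers: Sequence[Tuple[int, int]]) -> int:
    """Receptive field of a stack of conv layers given as (kernel_size, stride)."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) input steps
        jump *= stride             # stride multiplies the step between outputs
    return rf


def split_channels(total: int, ratios: Sequence[float]) -> List[int]:
    """Divide `total` channels among branches in proportion to `ratios`."""
    if total < 0 or not ratios or any(r <= 0 for r in ratios):
        raise ValueError("total must be non-negative and ratios positive")
    scale = total / sum(ratios)
    exact = [r * scale for r in ratios]
    counts = [int(x) for x in exact]  # floor of each share
    leftover = total - sum(counts)
    # Hand remaining channels to the branches with the largest fractional part.
    order = sorted(range(len(ratios)), key=lambda i: exact[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# Hypothetical branch layout: (kernel_size, stride) pairs per branch.
branches = {
    "1x1": [(1, 1)],
    "3x3": [(3, 1)],
    "3x3+3x3": [(3, 1), (3, 1)],  # two stacked 3x3 convs cover a 5x5 window
}
channels = split_channels(256, [1, 2, 1])  # assumed ratio and budget

for (name, layers), c in zip(branches.items(), channels):
    print(f"{name}: receptive field {receptive_field(layers)}, {c} channels")
```

In a block of this kind the branch outputs would typically be concatenated along the channel axis, so the ratio determines how much of the fused feature map each receptive field contributes.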
