Abstract

Object detection is a core task in computer vision that underpins numerous applications. In recent years, deep learning-based methods have achieved remarkable performance in object detection. However, detection performance on small objects remains unsatisfactory. Specific architectures have therefore been proposed to address this issue in certain areas, such as remote-sensing and UAV images. In this paper, we design a pluggable and non-intrusive method, termed PatchDetector, that improves small object detection while avoiding the time and resource overhead of retraining the entire network. To achieve this, we first analyze why mainstream networks perform poorly on small objects and find that the fundamental reason is that the features of small objects are superseded by the background, which leads to a significant semantic gap across multi-level layers. We then conduct a significance analysis to identify the features that are essential for improving small object detection. Next, using the located significant features, we devise a pluggable patch network that extracts essential features for small objects and is non-intrusive to the original network. Experiments on mainstream detectors, including the YOLO series and Faster RCNN, show that the proposed PatchDetector achieves mAP gains of 0.4%∼2.0% on small objects without compromising performance on medium and large objects.
