Abstract

As high-resolution images have become widespread, detecting objects in them quickly and accurately has attracted increasing attention in recent studies. Current convolutional neural network (CNN)-based detection methods have difficulty detecting small objects because of scale variation. In this work, we propose an improved generic framework based on YOLOv3. Equipped with multiresolution supervision for training and multiresolution aggregation for inference, the method addresses the challenge of scale variation in high-resolution images. First, we move the multiscale prediction position up and add a dilated convolution module to YOLOv3 to improve detection accuracy, especially for small objects. Then, we present a coarse-to-fine method to reduce detection time. Experiments on the COCO dataset show that our approach achieves 2.8% better accuracy than the original YOLOv3. On the Dataset for Object deTection in Aerial images (DOTA), a high-resolution remote sensing dataset, our approach outperforms YOLOv3 by nearly three percentage points in mean average precision. Moreover, it is up to three times faster and two times smaller than the original YOLOv3.
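
The role of the dilated convolution module can be illustrated with a short sketch. It computes how dilation enlarges the receptive field of stacked stride-1 convolutions without adding parameters. The function names and the dilation rates (1, 2, 4) are illustrative assumptions, not the configuration used in this work.

```python
from typing import Iterable, Tuple


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel along a single axis.

    A kernel with k taps and dilation d leaves (d - 1) gaps between
    neighbouring taps, so it covers k + (k - 1) * (d - 1) positions.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers: Iterable[Tuple[int, int]]) -> int:
    """Receptive field of stacked stride-1 convolutions.

    `layers` yields (kernel_size, dilation) pairs. With stride 1, each
    layer widens the receptive field by (effective kernel size - 1).
    """
    receptive_field = 1
    for kernel_size, dilation in layers:
        receptive_field += effective_kernel_size(kernel_size, dilation) - 1
    return receptive_field


# Hypothetical example: three 3x3 layers, with and without dilation.
plain = stacked_receptive_field([(3, 1), (3, 1), (3, 1)])
dilated = stacked_receptive_field([(3, 1), (3, 2), (3, 4)])
print(plain, dilated)
```

Under these assumed rates, both stacks have the same number of weights, but the dilated one sees a wider context around each location, which is the general motivation for using dilation when small objects need surrounding context.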
