Abstract

Recently, there has been renewed interest in object detection in computer vision. Several attempts have been made to improve small-object recognition, particularly in aerial images, where weak representative features and low resolution make such instances challenging. The principal objective of this paper is to introduce the Fused RetinaNet detector, an enhanced RetinaNet in which a novel context fusion module replaces the feature pyramid network (FPN) to enrich the semantic information of low layers and the spatial resolution of top layers. This module first aggregates multi-scale backbone feature maps at once to build a robust shallow layer. Then, a parallel-branch dilated module is placed in front of each network level to provide richer context information. Finally, the aggregation and dilated modules are laterally connected at each backbone level via a bottom-up path to compensate for information loss during downsampling. Comprehensive experiments are carried out on the NWPU VHR-10 and DOTA aerial image datasets. The evaluation shows that the proposed Fused RetinaNet is superior for small-target detection in aerial images: it achieves a mean average precision (mAP) of 91.83% and 57.13% on NWPU VHR-10 and DOTA, respectively, and reaches a mAP of 62.59% on DOTA when the k-means clustering algorithm is used.
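As a rough illustration of the parallel-branch dilated module mentioned above, the sketch below shows one plausible way such a block could be assembled in PyTorch: several 3x3 convolutions with different dilation rates run in parallel on the same feature map and are fused by a 1x1 convolution. The class name, dilation rates, and channel counts are assumptions chosen for illustration; they are not taken from the paper's actual implementation.

```python
import torch
import torch.nn as nn


class ParallelDilatedBranch(nn.Module):
    """Hypothetical parallel-branch dilated block: the same input is passed
    through 3x3 convolutions with different dilation rates, so the output
    mixes receptive fields of several sizes without losing resolution."""

    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        ])
        # 1x1 convolution fuses the concatenated branch outputs back to `channels`.
        self.fuse = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))


if __name__ == "__main__":
    # Example: a pyramid-level feature map (256 channels, 80x80) keeps its shape.
    feat = torch.randn(1, 256, 80, 80)
    print(ParallelDilatedBranch(256)(feat).shape)  # torch.Size([1, 256, 80, 80])
```

In this reading, one such block would sit in front of each network level, and its output would then be combined with the aggregated shallow features through the bottom-up lateral connections described in the abstract.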
