Training lightweight network from scratch for efficient object detection in aerial images

Ang Su, Pengyu Guo, Banglei Guan

doi:10.1117/12.2535479

Abstract

Object detection in aerial images plays an important role in a wide range of applications. Although much effort has been devoted to it over the last decade, it remains an active and challenging problem because of highly complex backgrounds and large variations in the visual appearance of objects caused by viewpoint variation, occlusion, illumination, etc. Recently, many object detectors based on deep learning have significantly improved detection performance in aerial images. However, the most accurate neural networks usually have hundreds of layers and thousands of channels, and thus require substantial computation and memory. Moreover, state-of-the-art object detectors are usually fine-tuned from models pretrained on the classification dataset ImageNet, which limits modification of the network architecture and also leads to learning bias because the domains differ. In this paper, we train a lightweight convolutional neural network from scratch to perform object detection in aerial images. In designing the lightweight network, we employ Concatenated Rectified Linear Units (CReLU) and depthwise separable convolution to reduce the computation cost and model size. In training the lightweight network from scratch, we employ Group Normalization (GN) in each convolution layer, which yields a smoother optimization landscape and more stable gradients. A series of ablation experiments is conducted on the recently published large-scale Dataset for Object deTection in Aerial images (DOTA), and the results show that the proposed object detection method, with a lightweight network trained from scratch, achieves competitive performance with a smaller model size and lower computation cost.
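The three building blocks named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the list-based data layout, and the channel sizes in the usage note are assumptions chosen only for illustration, and the group normalization shown here omits the learned scale and shift and the spatial dimensions.

```python
import math


def crelu(values):
    """CReLU: concatenate ReLU(x) and ReLU(-x), doubling the channel count."""
    positive = [max(v, 0.0) for v in values]
    negative = [max(-v, 0.0) for v in values]
    return positive + negative


def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a k x k depthwise conv plus a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def group_norm(channels, num_groups, eps=1e-5):
    """Normalize per-channel values within each group of channels.

    Simplified assumption: one scalar per channel, no learned scale/shift.
    """
    size = len(channels) // num_groups
    out = []
    for g in range(num_groups):
        group = channels[g * size:(g + 1) * size]
        mean = sum(group) / size
        var = sum((v - mean) ** 2 for v in group) / size
        out.extend((v - mean) / math.sqrt(var + eps) for v in group)
    return out
```

For a hypothetical layer with 64 input channels, 128 output channels, and a 3 x 3 kernel, `conv_params(64, 128, 3)` gives 73728 weights while `depthwise_separable_params(64, 128, 3)` gives 8768, roughly an 8x reduction. Because CReLU doubles the number of output channels, a layer followed by it needs only half as many filters to reach the same output width, and GN computes its statistics per sample, so it does not depend on batch size.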