Effective Contexts for UAV Vehicle Detection

Jianxiu Yang, Xuemei Xie, Wenzhe Yang

doi:10.1109/access.2019.2923407

Open Access

Abstract

Vehicle detection in unmanned aerial vehicle (UAV) images is challenging because of the small size of objects, complex backgrounds, and the imbalance among the various vehicle samples. This paper proposes a high-performance UAV vehicle detector. We use the single-shot refinement neural network (RefineDet) as the base network; its top-down architecture supplies contextual information that supports accurate detection. However, for small vehicles, the top-down architecture introduces too much context, which brings interference from the surroundings. We present a multi-scale adjacent connection module (ACM) to provide effective contextual information and reduce interference for vehicle detection. In addition, we adopt an alternate double loss training strategy (ADT) to address the imbalance between hard and easy examples during training, and we design default boxes suited to the distribution of the UAV dataset to improve the recall rate. Our method achieves 92.0% and 90.4% accuracy on the collected UAV dataset and the publicly available Stanford drone dataset, respectively, and the proposed detector runs at 58 FPS on a single GPU.
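The abstract does not say how the default boxes are derived from the dataset distribution. As a rough illustration only, the sketch below assumes one simple approach: take the geometric-mean side length of each ground-truth box and place default-box sizes at evenly spaced quantiles of that distribution. The function name `default_box_sizes`, the `num_scales` parameter, and the quantile rule are all hypothetical and are not taken from the paper.

```python
import math
from statistics import quantiles


def default_box_sizes(boxes, num_scales=4):
    """Suggest default-box side lengths from ground-truth box sizes.

    Hypothetical helper, not the authors' procedure.
    boxes: iterable of (width, height) pairs in pixels.
    Returns num_scales side lengths in ascending order.
    """
    # Geometric mean of width and height gives one size per box.
    sizes = sorted(math.sqrt(w * h) for w, h in boxes)
    if len(sizes) < 2:
        raise ValueError("need at least two boxes")
    # quantiles(..., n=k+1) returns k cut points that split the
    # size distribution into k+1 equally populated intervals.
    cuts = quantiles(sizes, n=num_scales + 1, method="inclusive")
    return [round(c, 1) for c in cuts]


if __name__ == "__main__":
    # Toy annotations: small objects, as is typical of aerial imagery.
    toy_boxes = [(10, 10), (20, 20), (30, 30), (40, 40), (50, 50)]
    print(default_box_sizes(toy_boxes))
```

Under this assumption, a dataset dominated by small vehicles yields small default boxes, which is one way a better match between default boxes and ground truth could raise recall.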

Highlights

Unmanned aerial vehicles (UAVs) have been used in many applications, such as transportation monitoring [1]–[3], road traffic information estimation [4], and traffic imagery collection [5].
We have made the following main contributions: (1) We present a multi-scale adjacent connection module (ACM) to provide effective contextual information and reduce interference for accurate detection of small vehicles.
We focus on vehicle detection based on unmanned aerial vehicle images.

Summary

Introduction

Unmanned aerial vehicles (UAVs) have been used in many applications, such as transportation monitoring [1]–[3], road traffic information estimation [4], and traffic imagery collection [5]. Vehicle detection in UAV images is challenging because of the small size of objects, complex backgrounds, and the imbalance among the various vehicle samples. To address these problems, many CNN-based vehicle detectors have been proposed in recent years. The vehicle detection methods [12]–[14] based on Faster R-CNN [15] use contextual information to strengthen feature representation. These methods achieve high accuracy but are not suitable for real-time applications. A real-time method based on YOLOv2 [16] has also been proposed to detect vehicles in UAV images.
