Abstract

With the rapid development of Unmanned Aerial Vehicles, vehicle detection in aerial images plays an important role in many applications. Compared with general object detection, vehicle detection in aerial images remains a challenging research topic because it is affected by several distinctive factors, e.g. varying camera angles, small vehicle size, and complex backgrounds. In this paper, a Feature Fusion Deep-Projection Convolution Neural Network is proposed to improve the detection of small vehicles in aerial images. The backbone of the proposed framework uses a novel residual block, named the stepwise res-block, to extract high-level semantic features while preserving low-level detail features. A specially designed feature fusion module is adopted to further balance the features obtained from different levels of the backbone. A deep-projection deconvolution module is used to minimize the information contamination introduced by the down-sampling and up-sampling processes. The proposed framework has been evaluated on the UCAS-AOD, VEDAI, and DOTA datasets. According to the evaluation results, the proposed framework outperforms other state-of-the-art vehicle detection algorithms for aerial images.

Highlights

  • As one of the core research topics in computer vision, object detection is widely used in automatic driving, crowd flow counting, topographic exploration, environmental pollution monitoring, etc.

  • The proposed feature fusion and deep-projection module (FFDP)-Convolution Neural Network (CNN) detector achieves an Average Precision (AP) of 97.34%, which is roughly 1.2% higher than the AP achieved by YOLO v2 (2017) [43]

  • The P-R curve and the detailed performance of the proposed FFDP-CNN detector are shown in Fig. 6 and Tables 7 and 8, respectively
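
The highlights report results as Average Precision (AP) derived from a precision-recall (P-R) curve. As a rough illustration of how such a number is typically obtained, the sketch below computes AP with all-point interpolation over ranked detections. This excerpt does not state which AP protocol or matching threshold the authors used, so the interpolation scheme, the function name `average_precision`, and its parameters are assumptions made for illustration only.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the P-R curve using all-point interpolation.

    Assumed inputs (hypothetical, not taken from the paper):
      scores            confidence of each detection
      is_true_positive  True if that detection matched a ground-truth vehicle
      num_ground_truth  total number of ground-truth vehicles
    """
    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = 0
    fp = 0
    recalls = []
    precisions = []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))

    # Replace each precision with the maximum precision at any higher
    # recall, so the curve is non-increasing as recall grows.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Accumulate rectangle areas wherever recall increases.
    ap = 0.0
    prev_recall = 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    # Toy example: four detections, four ground-truth vehicles.
    print(average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, True], 4))
```

Whether a detection counts as a true positive depends on an overlap criterion between predicted and ground-truth boxes, which is outside this sketch and would need to follow the evaluation protocol of each dataset.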


Summary

Introduction

As one of the core research topics in computer vision, object detection is widely used in automatic driving, crowd flow counting, topographic exploration, environmental pollution monitoring, etc. The task of object detection is to find the targets in an image and determine their locations and categories. Because the appearance of objects changes significantly with various factors [1,2,3,4], object detection is commonly regarded as one of the most challenging tasks in computer vision. Traditional object detection algorithms are normally based on hand-crafted features or textures. Haralick et al. proposed textural features for image classification in 1973 [5]. Lowe proposed the scale-invariant feature transform (SIFT) in 1999 [6]. Melgani used SIFT to count the number of vehicles and trained a support vector

