Abstract

This paper addresses the problem of car detection from aerial images using Convolutional Neural Networks (CNNs). This problem presents additional challenges compared to car (or any object) detection from ground images because the features of vehicles in aerial images are more difficult to discern. To investigate this issue, we assess the performance of three state-of-the-art CNN algorithms, namely Faster R-CNN, which is the most popular region-based algorithm, as well as YOLOv3 and YOLOv4, which are known to be the fastest detection algorithms. We analyze two datasets with different characteristics to check the impact of various factors, such as the altitude of the unmanned aerial vehicle (UAV), camera resolution, and object size. A total of 52 training experiments were conducted to account for the effect of different hyperparameter values. The objective of this work is to conduct the most robust and exhaustive comparison between these three cutting-edge algorithms on the specific domain of aerial images. By using a variety of metrics, we show that the difference between YOLOv4 and YOLOv3 on the two datasets is statistically insignificant in terms of Average Precision (AP) (contrary to what was obtained on the COCO dataset). However, both of them yield markedly better performance than Faster R-CNN in most configurations. The only exception is that both of them exhibit a lower recall when object sizes and scales in the testing dataset differ largely from those in the training dataset.

Highlights

  • Unmanned aerial vehicles (UAVs) are nowadays a key enabling technology for a large number of applications such as surveillance [1], tracking [2], disaster management [3], smart parking [4], and Intelligent Transportation Systems [5], to name a few

  • Our objective in this paper is different, since we focus on the depth-wise aspect of the comparison by selecting three recent algorithms that are representative of the two main categories of object detectors, namely Faster R-CNN [19]

  • We provide a thorough comparison between the three most sophisticated categories of CNN approaches for object detection: Faster R-CNN, which is a region-based approach proposed in 2017; YOLOv3, which is still the most popular version of the You Only Look Once approach, proposed by Joseph Redmon in 2018; and the latest version, YOLOv4, released by Bochkovskiy et al. in April 2020

Summary

Introduction

Unmanned aerial vehicles (UAVs) are nowadays a key enabling technology for a large number of applications such as surveillance [1], tracking [2], disaster management [3], smart parking [4], and Intelligent Transportation Systems [5], to name a few. Thanks to their versatility, UAVs offer unique capabilities in collecting visual data using high-resolution cameras from different locations, angles, and altitudes. Compared with satellite imagery, UAV imagery has a much lower cost and provides more up-to-date views (many satellite maps are several months old and do not reflect recent changes). It can also be used for real-time image and video stream analysis much more affordably.

PSU+[27] UAV dataset: Training: 218 images (3365 car instances)
