Multiscale Object Detection from Drone Imagery Using Ensemble Transfer Learning

Rahee Walambe,Aboli Marathe,Ketan Kotecha

doi:10.3390/drones5030066

Rahee Walambe, Aboli Marathe + Show 1 more

Open Access

https://doi.org/10.3390/drones5030066

Copy DOI

Abstract

Object detection in uncrewed aerial vehicle (UAV) images has been a longstanding challenge in the field of computer vision. Specifically, object detection in drone images is a complex task due to objects of various scales such as humans, buildings, water bodies, and hills. In this paper, we present an implementation of ensemble transfer learning to enhance the performance of the base models for multiscale object detection in drone imagery. Combined with a test-time augmentation pipeline, the algorithm combines different models and applies voting strategies to detect objects of various scales in UAV images. The data augmentation also presents a solution to the deficiency of drone image datasets. We experimented with two specific datasets in the open domain: the VisDrone dataset and the AU-AIR Dataset. Our approach is more practical and efficient due to the use of transfer learning and two-level voting strategy ensemble instead of training custom models on entire datasets. The experimentation shows significant improvement in the mAP for both VisDrone and AU-AIR datasets by employing the ensemble transfer learning method. Furthermore, the utilization of voting strategies further increases the 3reliability of the ensemble as the end-user can select and trace the effects of the mechanism for bounding box predictions.

Highlights

The number of computer vision (CV) tasks such as object detection and image segmentation have gained extreme popularity in the last few decades
We focus on Object detection (OD) from drone images of two separate datasets: the VisDrone 2019 test dev set [10,11] and the AU-AIR dataset [12] using our novel framework based on the algorithm proposed in [13]
In this work, to tackle these challenges in uncrewed aerial vehicle (UAV) image datasets and to solve the general problem of deficient UAV detection sets, we propose and demonstrate an implementation of an ensemble algorithm by using the transfer learning from baseline OD models and data augmentation technique on the VisDrone 2019 dataset [10,11] and AU-AIR dataset [12]

Summary

Introduction

The number of computer vision (CV) tasks such as object detection and image segmentation have gained extreme popularity in the last few decades. Object detection (OD) is challenging and useful for detecting the various visual objects of a specific class (such as cars, pedestrians, animals, terrains, etc.) in the images. OD deals with the development of computational models and techniques and is one of the fundamental problems of computer vision. It is a basis of other tasks such as segmentation [1,2,3,4], image captioning [5,6,7], and object tracking [8,9], etc. We focus on OD from drone images of two separate datasets: the VisDrone 2019 test dev set [10,11] and the AU-AIR dataset [12] using our novel framework based on the algorithm proposed in [13]

Methods

Results

Discussion

Conclusion