Abstract

Unmanned aircraft systems, or drones, let us capture scenes from a bird’s-eye view, and they have been rapidly deployed across a wide range of practical domains, e.g., agriculture, aerial photography, fast delivery, and surveillance. Object detection is one of the core steps in understanding videos collected from drones. However, the task is very challenging because of the unconstrained viewpoints and low resolution of the captured videos. While modern deep-learning object detectors have recently achieved great success on general benchmarks, e.g., PASCAL-VOC and MS-COCO, their robustness on aerial images captured by drones is not well studied. In this paper, we present an evaluation of state-of-the-art deep-learning detectors, including Faster R-CNN (Faster Regional CNN), RFCN (Region-based Fully Convolutional Networks), SNIPER (Scale Normalization for Image Pyramids with Efficient Resampling), Single-Shot Detector (SSD), YOLO (You Only Look Once), RetinaNet, and CenterNet, for object detection in videos captured by drones. We conduct experiments on the VisDrone2019 dataset, which contains 96 videos with 39,988 annotated frames, and provide insights into efficient object detectors for aerial images.

Highlights

  • Object detection is a fundamental yet difficult task in image processing and computer vision research

  • YOLOv3 uses a variant of Darknet, which is originally a 53-layer network trained on ImageNet

  • We report the performance results with Average Precision (AP) (IoU = 0.50), AP (IoU = 0.75)
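As a rough illustration of what the IoU thresholds in the last highlight mean, the sketch below computes Intersection over Union for two axis-aligned boxes and checks whether a prediction would count as a match at a given threshold. The function names `iou` and `is_match`, the `(x1, y1, x2, y2)` box format, and the `>=` comparison are assumptions made for this example; they are not taken from the paper or from the VisDrone evaluation toolkit.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2).

    Assumes x1 <= x2 and y1 <= y2 (illustrative convention, not the paper's).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold):
    """True if the prediction overlaps the ground truth by at least `threshold`."""
    return iou(pred_box, gt_box) >= threshold


# A prediction shifted 2 pixels to the right of a 10x10 ground-truth box.
pred = (2, 0, 12, 10)
gt = (0, 0, 10, 10)
print(round(iou(pred, gt), 3))   # overlap 80 / union 120
print(is_match(pred, gt, 0.50))  # looser threshold
print(is_match(pred, gt, 0.75))  # stricter threshold
```

In this example the same prediction is accepted at IoU = 0.50 but rejected at IoU = 0.75, which is why AP at the higher threshold rewards tighter localization.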

Summary

Introduction

Object detection is a fundamental yet difficult task in image processing and computer vision research. Deep-learning technology has brought significant breakthroughs in recent years, and these techniques have driven remarkable progress in object detection. Some common challenges that object detectors face on aerial images are viewpoints, illumination, scale variations, perspectives, intra-class variations, low resolution, and occlusions. The research community has focused on deep learning and its applications to object recognition and detection tasks. While most of this work has addressed the challenges of normal viewpoints, in recent years there has been increasing interest in flying drones and their applications in healthcare, video surveillance, search-and-rescue, and agriculture. Object detection in this setting is a very challenging task, since video sequences or images captured by drones vary significantly in terms of scales, perspectives, and weather conditions.

CNN Models
Object Detection Methods
Faster R-CNN
RFCN: Region-Based Fully Convolutional Networks
SNIPER
Chip Selection
Base Network VGG-16
Model Architecture
RetinaNet
Class Imbalance
RetinaNet Detector Architecture
YOLO: You Only Look Once
Feature Extraction
Detection at Three Scales
Objective Score and Confidences
CenterNet
Object as Points
From Points to Bounding Boxes
Dataset
Evaluation Metrics
Model Configuration
Results
Analysis of Feature Maps Extraction
Discussion
Conclusion and Future Work