Abstract

Detecting objects in aerial images is an important task in various environmental and infrastructure-related applications. Deep learning object detectors such as RetinaNet offer decent detection performance; however, they require a large amount of annotated training data. Collecting annotated data is well known to be time-consuming and tedious, and for remote sensing tasks it often cannot be done sufficiently well because the required data must cover a wide variety of scenes and objects. In this paper, we analyze the performance of such a network given a limited amount of training data and address the research question of whether artificially generated training data can compensate for the small amount of training data in real-world data sets. For our experiments, we use the ISPRS 2D Semantic Labeling Contest Potsdam data set for vehicle detection, from which we derive vehicle bounding boxes suitable for our task. We generate artificial data based on vehicle blueprints and show that networks trained only on generated data may perform worse, but are still able to detect most of the vehicles found in the real data set. Moreover, we show that adding generated data to real-world data sets with a limited amount of training data can increase performance significantly and, in some cases, almost reach baseline performance levels.
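The abstract mentions deriving vehicle bounding boxes from a semantic labeling data set. One common way to do this, shown here only as a minimal sketch and not as the authors' procedure, is to take a binary vehicle mask and emit one axis-aligned box per connected component. The function name `boxes_from_mask`, the `min_pixels` filter, and the choice of 4-connectivity are assumptions made for illustration; extracting the mask from the label images is assumed to have happened beforehand.

```python
from collections import deque


def boxes_from_mask(mask, min_pixels=1):
    """Return one (x_min, y_min, x_max, y_max) box per 4-connected component.

    `mask` is a list of rows; truthy entries mark vehicle pixels.
    Components with fewer than `min_pixels` pixels are dropped
    (a hypothetical noise filter, not taken from the paper).
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []

    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue

            # Breadth-first flood fill over one component.
            queue = deque([(x, y)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            count = 0

            while queue:
                cx, cy = queue.popleft()
                count += 1
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)

                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))

            if count >= min_pixels:
                boxes.append((x_min, y_min, x_max, y_max))

    return boxes


if __name__ == "__main__":
    toy_mask = [
        [1, 1, 0, 0, 0],
        [1, 1, 0, 0, 1],
        [0, 0, 0, 0, 1],
        [0, 1, 0, 0, 0],
    ]
    for box in boxes_from_mask(toy_mask, min_pixels=2):
        print(box)
```

A general limitation of this approach is that vehicles whose masks touch are merged into a single box, and a vehicle split by an occluding object yields several boxes, so annotations derived this way would typically need a sanity check.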

Highlights

  • Object detection in aerial images is an important task in remote sensing applications like environmental monitoring, infrastructure surveillance, or traffic monitoring (Heipke and Rottensteiner, 2020; Ma et al., 2019)

  • Deep neural networks like RetinaNet (Lin et al., 2017b), which are designed for object detection, have been shown to be suitable tools for solving such a task (Lin et al., 2017b; Sun et al., 2018)

  • We base our work on the ISPRS 2D Semantic Labeling Contest Potsdam data set, which consists of 38 image patches covering part of the city of Potsdam (Rottensteiner et al., 2013)

Summary

INTRODUCTION

Object detection in aerial images is an important task in remote sensing applications like environmental monitoring, infrastructure surveillance, or traffic monitoring (Heipke and Rottensteiner, 2020; Ma et al., 2019). Deep neural networks like RetinaNet (Lin et al., 2017b), which are designed for object detection, have been shown to be suitable tools for solving such a task (Lin et al., 2017b; Sun et al., 2018). They have proven their capabilities in different benchmarks covering general imagery and aerial imagery (Lin et al., 2014b; Xia et al., 2018). We first evaluate the impact of limited amounts of real-world training data

RELATED WORK
DATA GENERATION
Deriving object detection annotations
Deficiencies of the derived data set
Generation of artificial data
EXPERIMENTS
Performance on complete and reduced real-world data sets
Artificially generated data performance
Joining real-world and generated data
CONCLUSION AND FUTURE WORK
Findings
Influence of data set deficiencies
