Abstract

As a data-driven approach, deep learning requires a large amount of annotated data for training to obtain a sufficiently accurate and generalized model, especially in the field of computer vision. However, when compared with generic object recognition datasets, aerial image datasets are more challenging to acquire and more expensive to label. Obtaining a large amount of high-quality aerial image data for object recognition and image understanding is an urgent problem. Existing studies show that synthetic data can effectively reduce the amount of training data required. Therefore, in this paper, we propose the first synthetic aerial image dataset for ship recognition, called UnityShip. This dataset contains over 100,000 synthetic images and 194,054 ship instances, including 79 different ship models in ten categories and six different large virtual scenes with different time periods, weather environments, and altitudes. The annotations include environmental information, instance-level horizontal bounding boxes, oriented bounding boxes, and the type and ID of each ship. This provides the basis for object detection, oriented object detection, fine-grained recognition, and scene recognition. To investigate the applications of UnityShip, the synthetic data were validated for model pre-training and data augmentation using three different object detection algorithms and six existing real-world ship detection datasets. Our experimental results show that for small-sized and medium-sized real-world datasets, the synthetic data achieve an improvement in model pre-training and data augmentation, showing the value and potential of synthetic data in aerial image recognition and understanding tasks.
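To illustrate how the horizontal and oriented bounding box annotations relate, the following minimal sketch derives a horizontal box from an oriented one. It assumes an oriented box given as a center, width, height, and rotation angle in degrees; this parameterization and the function names `obb_corners` and `obb_to_hbb` are hypothetical and are not taken from the UnityShip annotation format.

```python
import math


def obb_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of an oriented box as (x, y) tuples.

    Assumed (hypothetical) parameterization: center (cx, cy), side
    lengths w and h, and a rotation of angle_deg degrees about the center.
    """
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    half_offsets = ((-w / 2, -h / 2), (w / 2, -h / 2),
                    (w / 2, h / 2), (-w / 2, h / 2))
    # Rotate each corner offset, then translate it to the box center.
    return [(cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
            for dx, dy in half_offsets]


def obb_to_hbb(cx, cy, w, h, angle_deg):
    """Return the enclosing horizontal box as (xmin, ymin, xmax, ymax)."""
    xs, ys = zip(*obb_corners(cx, cy, w, h, angle_deg))
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    # A 40 x 10 box centered at (50, 50), rotated by 30 degrees.
    print(obb_to_hbb(50, 50, 40, 10, 30))
```

The horizontal box is simply the axis-aligned extent of the rotated corners, so a dataset that stores oriented boxes can always supply horizontal ones, while the reverse conversion loses the orientation.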

Highlights

  • In the past decade, deep learning methods have achieved milestones in various fields. For object recognition and scene understanding in aerial images, significant achievements have been made as a result of deep-learning-based algorithms

  • In the data augmentation experiments, the improvement was larger for small datasets, and larger for the single-stage algorithms (FCOS and RetinaNet) than for the two-stage algorithm (Faster RCNN)

  • We present the first synthetic dataset for ship identification in aerial images, UnityShip, captured and annotated using the Unity virtual engine; this comprises over 100,000 synthetic images and 194,054 ship instances


Summary

Introduction

Deep learning methods have achieved milestones in various fields. For object recognition and scene understanding in aerial images, significant achievements have been made as a result of deep-learning-based algorithms. For generic object recognition, it is relatively easy to obtain a large amount of data from the real world to constitute a sufficiently large dataset, such as VOC [1], COCO [2], OID [3], and Objects365 [4]; these datasets usually contain tens of thousands or even millions of images and dozens or even hundreds of object categories. These large-scale datasets have greatly promoted the development of computer vision algorithms and related applications.

