Abstract

In recent years, many convolutional neural network (CNN)-based methods have been proposed for scene classification of remote sensing (RS) images. Because RS datasets generally contain few training samples, data augmentation is often used to expand the training set. Conventional data augmentation, however, keeps the original label while changing the content of the image, which is not always appropriate. In this study, label augmentation (LA) is presented to fully utilize the training set by assigning a joint label to each generated image, so that the scene label and the applied data augmentation are considered at the same time. Moreover, in the test process, the outputs for images obtained by different data augmentations are aggregated. The augmented samples, however, increase the intra-class diversity of the training set, which makes the subsequent classification more challenging. To address this issue and further improve classification accuracy, Kullback–Leibler (KL) divergence is used to constrain the output distributions of two training samples with the same scene category to be consistent. Extensive experiments were conducted on the widely used UCM, AID and NWPU datasets. The proposed method surpasses other state-of-the-art methods in classification accuracy. For example, on the challenging NWPU dataset, a competitive overall accuracy of 91.05% is obtained with a 10% training ratio.
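The joint-label idea, the test-time aggregation and the KL constraint can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the augmentation set is the six orderings of the RGB channels, that the joint label is encoded as `scene_label * T + transform_id`, and that aggregation averages the softmax outputs of the matching joint classes. The function names (`joint_label`, `augment_with_joint_labels`, `aggregate_predictions`, `kl_divergence`) are hypothetical.

```python
import itertools

import numpy as np

# Assumed transform set: all 6 orderings of the RGB channels (index 0 is the identity).
PERMS = list(itertools.permutations(range(3)))


def joint_label(scene_label, transform_id, num_transforms=len(PERMS)):
    """Encode (scene class, transform) as one joint class index (assumed encoding)."""
    return scene_label * num_transforms + transform_id


def augment_with_joint_labels(image, scene_label):
    """Return every channel-permuted copy of an HxWx3 image with its joint label."""
    return [(image[..., list(perm)], joint_label(scene_label, t))
            for t, perm in enumerate(PERMS)]


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def aggregate_predictions(joint_logits, num_classes):
    """Average scene probabilities over the T transformed views of one test image.

    joint_logits has shape (T, num_classes * T); row t holds the joint-class
    logits that the model produced for the t-th transformed view.
    """
    num_views = joint_logits.shape[0]
    per_view = joint_logits.reshape(num_views, num_classes, num_views)
    idx = np.arange(num_views)
    # For view t, keep only the logits of the joint classes (c, t).
    matched = per_view[idx, :, idx]          # shape (T, num_classes)
    return softmax(matched, axis=1).mean(axis=0)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete output distributions."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * (np.log(p) - np.log(q))))
```

In training, a term such as `kl_divergence(p_a, p_b)` would be added to the classification loss for two samples `p_a` and `p_b` of the same scene category. How the pairs are chosen and how the term is weighted are not specified in this summary.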

Highlights

  • With the advancement of imaging technology, remote sensing (RS) images have a higher resolution than before

  • The above results indicate that remote sensing images are sensitive to color permutation, so it is improper to directly assign the original label to a new image generated by color permutation

  • The color transformation changed the content of the image, which increased the complexity of the classification task


Summary

Introduction

With the advancement of imaging technology, remote sensing (RS) images have a higher resolution than before. Scene classification was accomplished by using low-level features, including color histograms (CH) [10], texture [11,12] and scale invariant feature transform (SIFT) [13]. These methods relied on engineering skills and expert knowledge. To overcome the limitations of classification methods based on low-level features, many methods that aggregate the extracted local low-level visual features into a mid-level scene representation have been proposed, and they achieve good performance on the scene classification task. As one of the most commonly used methods based on mid-level visual features, …

