Abstract

Deep convolutional neural networks (CNNs) have been widely used to obtain high-level representations in various computer vision tasks. In the field of remote sensing, however, there are not sufficient images to train a useful deep CNN from scratch. Instead, successful pre-trained deep CNNs are commonly transferred to remote sensing tasks. In this transfer, the generalization power of the features in the pre-trained deep CNNs plays the key role. In this paper, we propose two promising architectures for extracting general features from pre-trained deep CNNs for remote scene classification, suggesting two directions for improvement. First, ahead of the pre-trained deep CNNs, we design a linear PCA network (LPCANet) that synthesizes the spatial information of remote sensing images in each spectral channel. This design shortens the spatial "distance" between the target and source datasets of the pre-trained deep CNNs. Second, we introduce quaternion algebra into the LPCANet, which further shortens the spectral "distance" between remote sensing images and the images used to pre-train the deep CNNs. With five well-known pre-trained deep CNNs, experimental results on three independent remote sensing datasets demonstrate that our proposed framework obtains state-of-the-art results without fine-tuning or feature fusion. This paper also provides a baseline for transferring newly pre-trained deep CNNs to other remote sensing tasks.
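As context for the first direction, below is a minimal sketch of a one-stage linear PCA filter bank in the spirit of PCANet, applied to a single spectral channel. The patch size, filter count, and non-overlapping patch sampling are illustrative assumptions rather than the paper's exact configuration.

    # Minimal one-stage PCA filter bank, PCANet-style (illustrative assumptions:
    # 5x5 filters, 8 filters, non-overlapping patch sampling).
    import numpy as np
    from scipy.signal import convolve2d

    def learn_pca_filters(images, k=5, n_filters=8):
        """Learn k x k PCA filters from mean-removed patches of 2-D images."""
        patches = []
        for img in images:
            h, w = img.shape
            for i in range(0, h - k + 1, k):      # non-overlapping, for brevity
                for j in range(0, w - k + 1, k):
                    p = img[i:i + k, j:j + k].ravel()
                    patches.append(p - p.mean())  # remove patch mean, as in PCANet
        X = np.stack(patches)                     # (n_patches, k*k)
        # Principal directions of the patch covariance serve as the filters.
        _, _, vt = np.linalg.svd(X, full_matrices=False)
        return vt[:n_filters].reshape(n_filters, k, k)

    def lpca_layer(channel, filters):
        """Convolve one spectral channel with each PCA filter."""
        return np.stack([convolve2d(channel, f, mode='same') for f in filters])

    # Usage on random stand-in data (one spectral channel per image):
    rng = np.random.default_rng(0)
    train = [rng.standard_normal((64, 64)) for _ in range(10)]
    filters = learn_pca_filters(train)
    maps = lpca_layer(train[0], filters)          # (8, 64, 64) feature maps

Placing such a layer in front of a pre-trained CNN is the structural idea behind the LPCANet: the filters are learned from the target images themselves, without supervised training, so they adapt to the spatial statistics of the remote sensing data.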

Highlights

  • Remote sensing image processing has achieved great advances in recent years, from low-level tasks, such as segmentation, to high-level ones, such as classification [1,2,3,4,5,6,7]

  • We evaluate the performance of this framework for remote scene classification under different conditions, and explore the way in which the linear principal component analysis (PCA) network (LPCANet) synthesizes spatial and spectral information of remote sensing images and enhances the generalization power of pre-trained deep convolutional neural networks (CNNs)

  • We propose two scenarios to test the performance of the LPCANet on extracting general features for pre-trained deep CNNs, in space and in spectrum, respectively (a quaternion-encoding sketch follows this list)
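The spectral scenario rests on quaternion algebra. As a hedged illustration of the standard entry point for such extensions, the sketch below encodes an RGB image as pure quaternions (r·i + g·j + b·k) and implements the Hamilton product; it is not the paper's actual quaternion LPCANet, whose exact operations are not reproduced here.

    # Standard pure-quaternion encoding of color images, the usual starting
    # point for quaternion extensions of PCA-style networks.
    import numpy as np

    def rgb_to_quaternion(image):
        """Map an (H, W, 3) RGB image to (H, W, 4) quaternions (w, x, y, z)."""
        h, w, _ = image.shape
        q = np.zeros((h, w, 4), dtype=np.float64)
        q[..., 1:] = image              # real part stays 0: a "pure" quaternion
        return q

    def qmul(p, q):
        """Hamilton product of quaternion arrays with components (w, x, y, z)."""
        pw, px, py, pz = np.moveaxis(p, -1, 0)
        qw, qx, qy, qz = np.moveaxis(q, -1, 0)
        return np.stack([
            pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw,
        ], axis=-1)

    # Sanity check: i * j = k in quaternion algebra.
    i = np.array([0., 1., 0., 0.])
    j = np.array([0., 0., 1., 0.])
    print(qmul(i, j))                   # -> [0. 0. 0. 1.], i.e. k

The appeal of this encoding is that the three spectral channels are processed as one algebraic object rather than independently, which is what lets a quaternion LPCANet couple spectral information across channels.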


Introduction

Remote sensing image processing has achieved great advances in recent years, from low-level tasks, such as segmentation, to high-level ones, such as classification [1,2,3,4,5,6,7]. Classifying remote sensing images according to a set of semantic categories is a very challenging problem because of high intra-class variability and low inter-class distance [5,6,7,8,9]. By constructing a holistic scene representation, the bag-of-visual-words (BOW) model has become one of the most popular approaches to scene classification in the remote sensing community [10]. Many variant methods based on the BOW model have been developed to improve the discriminative ability of the "visual words" [11,12,13].
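For readers unfamiliar with the BOW pipeline described above, a minimal, self-contained version is sketched below: local descriptors are clustered into a codebook, and each image is represented by its histogram of visual-word assignments. Raw patches as descriptors and the codebook size are simplifying assumptions; BOW variants in remote sensing typically use richer local features such as SIFT.

    # Minimal bag-of-visual-words pipeline: patches -> k-means codebook ->
    # per-image word histogram (the holistic scene representation).
    import numpy as np
    from sklearn.cluster import KMeans

    def extract_patches(img, k=8):
        """Raw k x k patches as stand-in local descriptors."""
        h, w = img.shape
        return np.stack([img[i:i + k, j:j + k].ravel()
                         for i in range(0, h - k + 1, k)
                         for j in range(0, w - k + 1, k)])

    def bow_histogram(img, codebook):
        """Normalized histogram of visual-word assignments for one image."""
        words = codebook.predict(extract_patches(img))
        hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
        return hist / hist.sum()

    # Usage on random stand-in data:
    rng = np.random.default_rng(0)
    images = [rng.standard_normal((64, 64)) for _ in range(20)]
    codebook = KMeans(n_clusters=32, n_init=10, random_state=0)
    codebook.fit(np.vstack([extract_patches(im) for im in images]))
    feature = bow_histogram(images[0], codebook)  # input to any scene classifier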
