Abstract

Feature representation is a classic problem in the machine learning community, because different representations can entangle and hide, to varying degrees, the explanatory factors of variation behind raw data. For scene classification in particular, performance generally depends on the discriminative power of the feature representation. Recently, unsupervised feature learning has attracted tremendous attention because of its ability to learn feature representations automatically. However, reliable performance of representations learned in an unsupervised fashion usually requires a large number of features and a complex mid-level feature representation framework. To alleviate these drawbacks, this paper presents a new framework for mid-level feature representation that does not need to learn many convolutional features during unsupervised feature learning and requires few parameter settings. Specifically, an unsupervised feature learning method, the sparse autoencoder, is employed to learn a relatively small number of convolutional features from the input dataset; extended features are then extracted from the learned features by a multiple normalized difference feature extraction method to compose a derivative feature set. At the mid-level feature representation stage, to avoid the poor performance of standard pooling under rotation and translation of scene images, global feature descriptors (histogram moments, mean, variance, standard deviation) are used to build mid-level representations of the images. For validation and comparison, the proposed approach is evaluated in experiments on two challenging high-resolution remote sensing datasets. The results demonstrate that the approach is effective and shows strong performance for remotely sensed scene classification.
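
The abstract does not specify the sparse autoencoder, the normalized-difference extraction, or the descriptor definitions in detail, so the following Python/NumPy sketch is only a rough, hypothetical illustration of the pipeline's three stages: a small set of learned feature maps, pairwise normalized-difference derivation, and global-descriptor pooling. All names (`global_descriptors`, `normalized_difference`) and parameter choices (16 histogram bins, pairwise combinations, second and third moments) are assumptions for illustration, not the authors' method.

```python
import numpy as np

def global_descriptors(feature_map):
    """Pool one 2-D feature map into global statistics that are
    insensitive to rotation and translation: mean, variance, standard
    deviation, and two histogram moments (choices here are illustrative)."""
    x = feature_map.ravel()
    hist, edges = np.histogram(x, bins=16, density=True)
    centers = (edges[:-1] + edges[1:]) / 2
    p = hist / max(hist.sum(), 1e-12)      # normalize to a discrete distribution
    mu = (p * centers).sum()
    moments = [(p * (centers - mu) ** k).sum() for k in (2, 3)]
    return np.array([x.mean(), x.var(), x.std(), *moments])

def normalized_difference(a, b, eps=1e-12):
    """Derive an extended feature map from a pair of learned maps,
    in the spirit of a normalized-difference combination."""
    return (a - b) / (np.abs(a) + np.abs(b) + eps)

# Suppose `maps` holds K convolutional feature maps produced by a sparse
# autoencoder (the autoencoder itself is omitted); shape (K, H, W).
rng = np.random.default_rng(0)
maps = rng.standard_normal((4, 32, 32))

# Extend the small learned set with pairwise normalized differences.
extended = [normalized_difference(maps[i], maps[j])
            for i in range(len(maps)) for j in range(i + 1, len(maps))]

# Mid-level representation: concatenate global descriptors of all maps.
representation = np.concatenate(
    [global_descriptors(m) for m in list(maps) + extended])
print(representation.shape)  # (K + K*(K-1)/2) maps x 5 descriptors each
```

The key design point the abstract argues for is the pooling stage: because each map is reduced to global statistics rather than spatially indexed pooled values, the resulting representation does not change when scene content is shifted or rotated within the image.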
