Abstract
Scene classification relying on images is essential in many systems and applications related to remote sensing. The scientific interest in scene classification from remotely collected images is increasing, and many datasets and algorithms are being developed. The introduction of convolutional neural networks (CNN) and other deep learning techniques contributed to vast improvements in the accuracy of image scene classification in such systems. To classify the scene from areal images, we used a two-stream deep architecture. We performed the first part of the classification, the feature extraction, using pre-trained CNN that extracts deep features of aerial images from different network layers: the average pooling layer or some of the previous convolutional layers. Next, we applied feature concatenation on extracted features from various neural networks, after dimensionality reduction was performed on enormous feature vectors. We experimented extensively with different CNN architectures, to get optimal results. Finally, we used the Support Vector Machine (SVM) for the classification of the concatenated features. The competitiveness of the examined technique was evaluated on two real-world datasets: UC Merced and WHU-RS. The obtained classification accuracies demonstrate that the considered method has competitive results compared to other cutting-edge techniques.
Highlights
Introduction and ContributionsIn our article, we use a remote sensing image classification architecture that leverages pre-trainedconvolutional neural networks (CNN) models that are able to classify high-resolution aerial images
We present a technique for feature extraction utilizing pre-trained neural networks and perform dimensionality reduction of the dense CNN activations from the convolutional layers using the Principal Component Analysis (PCA)
Classification Founded on Extracted Features from Different CNN Layers
Summary
We use a remote sensing image classification architecture that leverages pre-trained. CNN models that are able to classify high-resolution aerial images. The models are fully trained on the ImageNet [36] dataset. These CNNs perform feature extraction by removing some of the layers of the original pre-trained network. We use activations from the average pooling layer, last convolutional layer, and from some of the intermediate convolutional layers over the entire image, in order to obtain feature representations of the scene. We get a convolutional feature vector with significant dimensionality For this reason, feature dimensionality reduction methods are utilized prior to concatenating these features with the features extracted from average pooling layers. Our article proposes two widely used linear classifiers—linear SVM and Logistic Regression Classifier (LRC)—to process the extracted features and classify the scenes
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.