On the Co-Selection of Vision Transformer Features and Images for Very High-Resolution Image Scene Classification

Souleyman Chaib,Dou El Kefel Mansouri,Ahmed Hagag,Sahraoui Dhelim,Ibrahim Omara,Djamel Amar Bensaber

doi:10.3390/rs14225817

Abstract

Recent developments in remote sensing technology have allowed us to observe the Earth with very high-resolution (VHR) images. VHR imagery scene classification is a challenging problem in the field of remote sensing. Vision transformer (ViT) models have achieved breakthrough results in image recognition tasks. However, transformer–encoder layers encode different levels of features: the last layer represents semantic information, whereas the earliest layers contain more detailed data but ignore the semantic information of an image scene. In this paper, a new deep framework is proposed for VHR scene understanding by exploring the strengths of ViT features in a simple and effective way. First, pre-trained ViT models are used to extract informative features from the original VHR image scene, where the transformer–encoder layers are used to generate the feature descriptors of the input images. Second, we merge the obtained features into one signal data set. Third, some extracted ViT features do not describe certain image scenes well, such as agriculture, meadows, and beaches, which could negatively affect the performance of the classification model. To deal with this challenge, we propose a new algorithm for feature and image selection. This makes it possible to eliminate the less important features and images, as well as those that are abnormal: based on preserving the similarity of the whole data set, we select the most informative features and important images by dropping the irrelevant images that degrade classification accuracy. The proposed method was tested on three VHR benchmarks. The experimental results demonstrate that the proposed method outperforms other state-of-the-art methods.
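The three steps described in the abstract (per-layer descriptors, merging, then joint selection of features and images) can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the abstract does not specify the co-selection procedure, so variance ranking stands in for feature scoring and distance to the class centroid stands in for abnormal-image detection. The function names, the random arrays used in place of real ViT outputs, and the `keep` and `drop_frac` parameters are all hypothetical.

```python
import numpy as np

def merge_layer_features(layer_feats):
    """Concatenate per-layer descriptors, each of shape (n_images, d_layer)."""
    return np.concatenate(layer_feats, axis=1)

def select_features(X, keep):
    """Stand-in scoring (assumption): keep the `keep` highest-variance columns."""
    scores = X.var(axis=0)
    top = np.argsort(scores)[::-1][:keep]
    return np.sort(top)

def select_images(X, y, drop_frac):
    """Stand-in scoring (assumption): per class, drop the images that lie
    farthest from the class centroid."""
    keep_mask = np.ones(len(X), dtype=bool)
    for c in np.unique(y):
        members = np.flatnonzero(y == c)
        centroid = X[members].mean(axis=0)
        dist = np.linalg.norm(X[members] - centroid, axis=1)
        n_drop = int(drop_frac * len(members))
        if n_drop > 0:
            keep_mask[members[np.argsort(dist)[-n_drop:]]] = False
    return keep_mask

rng = np.random.default_rng(0)
n_images = 40
# Random stand-ins for descriptors taken from three encoder layers.
layer_feats = [rng.normal(size=(n_images, 8)) for _ in range(3)]
labels = np.repeat(np.arange(4), 10)  # 4 hypothetical scene classes

X = merge_layer_features(layer_feats)                  # shape (40, 24)
feat_idx = select_features(X, keep=12)                 # 12 column indices
img_mask = select_images(X[:, feat_idx], labels, 0.2)  # drops 2 per class
X_sel = X[img_mask][:, feat_idx]                       # shape (32, 12)
y_sel = labels[img_mask]
```

The reduced matrix `X_sel` and labels `y_sel` would then be passed to whichever classifier is used downstream. This sketch scores features and images in two separate passes for simplicity, whereas the paper describes selecting them together.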


Journal: Remote Sensing	Publication Date: Nov 17, 2022
Citations: 9	License type: CC BY 4.0

Similar Papers

Very High Resolution Image Scene Classification with Semantic Fisher Vectors
Souleyman Chaib ... Hongxun Yao
01 Jul 2018

Very High Resolution Image Scene Classification with Capsule Network
Souleyman Chaib ... Yanfeng Gu
01 Jul 2019

Deep Feature Fusion for VHR Remote Sensing Scene Classification
Souleyman Chaib ... Yanfeng Gu
IEEE Transactions on Geoscience and Remote Sensing | VOL. 55
01 Aug 2017

Multilabel classification of remote sensed satellite imagery
Ajay Kumar ... Dhiren Patel
Transactions on Emerging Telecommunications Technologies | VOL. 32
31 May 2020
