Integrating Adversarial Generative Network with Variational Autoencoders towards Cross-Modal Alignment for Zero-Shot Remote Sensing Image Scene Classification

Suqiang Ma, Chun Liu, Zheng Li, Wei Yang

doi:10.3390/rs14184533

Abstract

Remote sensing image scene classification takes image blocks as the classification units and predicts their semantic descriptors. Because it is difficult to obtain enough labeled samples for every class of remote sensing image scene, zero-shot classification methods, which can recognize image scenes not seen during training, are of great significance. The variational autoencoder (VAE) generative model has been applied to zero-shot remote sensing image scene classification by projecting the visual features of images and the semantic features of classes into the latent space and ensuring their alignment there. However, the VAE model uses the element-wise squared error as its reconstruction loss, which may not be suitable for measuring the reconstruction quality of visual and semantic features. This paper therefore proposes augmenting the VAE models with a generative adversarial network (GAN), using the GAN’s discriminator to learn a suitable reconstruction quality metric for the VAE. To promote feature alignment in the latent space, we also propose a cross-modal feature-matching loss, which ensures that the visual features of a class are aligned with the semantic features of that class and not with those of other classes. Experiments on a public dataset show the effects of the proposed improvements. We also investigate the impact of the visual feature extractor by testing three ResNet models: ResNet18, which extracts 512-dimensional visual features, and ResNet50 and ResNet101, which both extract 2048-dimensional visual features. The experimental results show that better performance is achieved with ResNet18, which indicates that deeper extractors and higher-dimensional extracted features may not benefit image scene classification under a zero-shot setting.
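The abstract does not give the formula for the cross-modal feature-matching loss, so the following is only a minimal sketch of one plausible form, not the authors' definition. It assumes the loss is a softmax cross-entropy over negative squared distances between a visual latent vector and the semantic latent vector of every class, so that each visual feature is pulled toward its own class and pushed away from the others. The function names, the distance choice, and the toy latent vectors are all hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cross_modal_matching_loss(visual_latents, labels, semantic_latents):
    """Assumed form of a cross-modal matching loss (not taken from the paper).

    visual_latents:   one latent vector per image
    labels:           class index of each image
    semantic_latents: one latent vector per class
    """
    total = 0.0
    for z_v, y in zip(visual_latents, labels):
        # Closer semantic latents get larger logits.
        logits = [-sq_dist(z_v, z_s) for z_s in semantic_latents]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy against the true class of this image.
        total += log_norm - logits[y]
    return total / len(visual_latents)


def predict(z_v, semantic_latents):
    """Zero-shot style prediction: pick the nearest class semantic latent."""
    dists = [sq_dist(z_v, z_s) for z_s in semantic_latents]
    return dists.index(min(dists))


# Toy latent vectors for three classes (made up for illustration).
semantic = [[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]]
labels = [0, 1, 2]
aligned = [[0.1, 0.0], [2.9, 0.1], [0.0, 3.1]]     # each near its own class
misaligned = [[2.9, 0.1], [0.0, 3.1], [0.1, 0.0]]  # each near a wrong class

loss_aligned = cross_modal_matching_loss(aligned, labels, semantic)
loss_misaligned = cross_modal_matching_loss(misaligned, labels, semantic)
print(loss_aligned < loss_misaligned)
```

Under this assumed form, the loss is small when every visual latent sits closest to the semantic latent of its own class and grows when it sits closer to another class, which is the behavior the abstract describes. In the full method such a term would be combined with the VAE and discriminator-based reconstruction losses, whose weights are not stated in the abstract.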

Journal: Remote Sensing
Publication Date: Sep 11, 2022
Citations: 6
License type: CC BY 4.0

Similar Papers

Multi-level Cross-modal Feature Alignment via Contrastive Learning towards Zero-shot Classification of Remote Sensing Image Scenes
arXiv (Cornell University)
31 May 2023

Representation Learning of Remote Sensing Knowledge Graph for Zero-Shot Remote Sensing Image Scene Classification
Yansheng Li ... Deyu Kong
11 Jul 2021

Robust deep alignment network with remote sensing knowledge graph for zero-shot and generalized zero-shot remote sensing image scene classification
Yansheng Li ... Ling Chen
ISPRS Journal of Photogrammetry and Remote Sensing | VOL. 179
10 Aug 2021

A Scene Images Diversity Improvement Generative Adversarial Network for Remote Sensing Image Scene Classification
Xin Pan ... Jian Zhao
IEEE Geoscience and Remote Sensing Letters | VOL. 17
05 Dec 2019
