Abstract

A hyperspectral image (HSI) carries nearly continuous spectral information, so targets of interest can be accurately identified from subtle details in their spectral properties. Spectral resolution at different scales captures different levels of spectral features: small-scale spectral bands are beneficial for extracting global details in vision transformers, while large-scale spectral bands are more effective for local features. The transformer, with its self-attention module, is well suited to extracting global information and even surpasses convolutional neural networks (CNNs) in various tasks. Some works based on the vision transformer have performed surprisingly well in HSI classification. However, single-scale vision transformers cannot adequately balance the extraction of local details against redundancy across scales. Recent work on a multi-scale vision transformer offers a solution for image classification using spatial patch-wise features. Inspired by this, we propose the cross-spectral vision transformer (CSiT), which uses two branches to extract pixel-wise multi-scale features, and we further design a multi-scale spectral embedding module to enhance local details between neighboring spectral bands. Moreover, based on the cross-attention operation, a single token from each branch serves as a query and is used to exchange information with the other branch. We evaluate the classification performance of the proposed CSiT on three classic HSI datasets with extensive experiments, which show that the multi-scale vision transformer architecture yields promising results for HSI classification with one-dimensional spectral bands.
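The two ideas named in the abstract, grouping a pixel's spectrum into tokens at two scales and letting a single token from one branch query the other branch, can be illustrated with a minimal pure-Python sketch. This is not the CSiT implementation: the helper names (`split_bands`, `embed_group`, `mean_token`, `cross_attention`), the 8-band toy spectrum, the group sizes 2 and 4, the (mean, min, max) stand-in for a learned embedding, the mean token standing in for a learned class token, and the identity query/key/value projections are all assumptions made for illustration.

```python
import math

def split_bands(spectrum, group_size):
    """Split a pixel's 1-D spectrum into non-overlapping groups of neighboring bands."""
    return [spectrum[i:i + group_size]
            for i in range(0, len(spectrum) - group_size + 1, group_size)]

def embed_group(group):
    """Hypothetical stand-in for a learned embedding: (mean, min, max) of a band group."""
    return [sum(group) / len(group), min(group), max(group)]

def mean_token(tokens):
    """Hypothetical stand-in for a learned class token: element-wise mean of the tokens."""
    dim = len(tokens[0])
    return [sum(t[j] for t in tokens) / len(tokens) for j in range(dim)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(query, tokens):
    """Single-query scaled dot-product attention (identity Q/K/V projections assumed)."""
    dim = len(query)
    scores = [sum(q * k for q, k in zip(query, t)) / math.sqrt(dim) for t in tokens]
    weights = softmax(scores)
    fused = [sum(w * t[j] for w, t in zip(weights, tokens)) for j in range(dim)]
    return fused, weights

# Made-up reflectance values for one pixel with 8 spectral bands.
spectrum = [0.1, 0.2, 0.4, 0.3, 0.5, 0.7, 0.6, 0.8]

# Two branches: fine groups (2 bands per token) and coarse groups (4 bands per token).
small_tokens = [embed_group(g) for g in split_bands(spectrum, 2)]
large_tokens = [embed_group(g) for g in split_bands(spectrum, 4)]

# One token from the small-scale branch queries the tokens of the large-scale branch.
query_small = mean_token(small_tokens)
fused, weights = cross_attention(query_small, large_tokens)

print(len(small_tokens), len(large_tokens))
print(weights)
print(fused)
```

Because only one token per branch acts as the query, the cost of this exchange grows linearly with the number of tokens in the other branch rather than quadratically, which is one plausible motivation for the single-token design; a full model would add learned projections and apply the same exchange in the opposite direction.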
