Abstract

Transformers have recently achieved great success in a number of computer vision tasks owing to their ability to capture long-range feature dependencies, whereas convolutional neural networks (CNNs) excel at extracting local features. Because capturing both short- and long-range dependencies between spectral bands is important for hyperspectral data classification, we propose MCE-ST, a convolution-transformer (conformer) based framework that exploits the complementary strengths of transformers and CNNs. Unlike the conventional transformer, which tokenizes its input with a linear projection, MCE-ST uses a convolution-based tokenization method to extract local dependencies between spectral bands. Moreover, since different hyperspectral samples may exhibit local relationships of different spans, a multiscale conformer encoder (MCE) comprising two separate branches of depth-wise dilated convolution with different kernel sizes is used to extract local interactions between tokens at different spans. We conducted experiments on four salt stress datasets and one cassava disease dataset. The results show that the proposed MCE-ST outperforms state-of-the-art techniques for crop stress classification using hyperspectral data. The code for MCE-ST is publicly available at https://github.com/Weejaa04/MCE-ST-GitHub.
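
The two local-feature ideas described above, convolution-based tokenization and a two-branch depth-wise dilated convolution, can be illustrated with a minimal pure-Python sketch. This is not the released MCE-ST implementation: the function names, the filter values, the dilation of 2, the zero padding, and the fusion of the two branches by summation are all assumptions chosen for illustration, and the attention part of the encoder is omitted.

```python
def conv_tokenize(spectrum, filters, stride=1):
    """Slide each 1-D filter along the band axis; each window yields one token
    with one channel per filter (hypothetical stand-in for the tokenizer)."""
    k = len(filters[0])
    starts = range(0, len(spectrum) - k + 1, stride)
    return [
        [sum(w * spectrum[s + i] for i, w in enumerate(f)) for f in filters]
        for s in starts
    ]


def depthwise_dilated_conv(tokens, kernels, dilation):
    """Depth-wise: channel c is filtered only by kernels[c].
    Dilated: kernel taps are `dilation` tokens apart.
    Zero padding keeps the number of tokens unchanged (odd kernel sizes)."""
    n, half = len(tokens), len(kernels[0]) // 2
    out = []
    for t in range(n):
        row = []
        for c, kernel in enumerate(kernels):
            acc = 0
            for j, w in enumerate(kernel):
                src = t + (j - half) * dilation
                if 0 <= src < n:  # taps outside the sequence count as zero
                    acc += w * tokens[src][c]
            row.append(acc)
        out.append(row)
    return out


def multiscale_local_features(tokens, small_kernels, large_kernels, dilation=2):
    """Two branches with different kernel sizes cover different token spans.
    Summing the branches is an assumption made for this sketch."""
    a = depthwise_dilated_conv(tokens, small_kernels, dilation)
    b = depthwise_dilated_conv(tokens, large_kernels, dilation)
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Toy spectrum with 12 bands and two hand-picked tokenization filters.
spectrum = list(range(12))
tokens = conv_tokenize(spectrum, filters=[[1, 1, 1], [-1, 0, 1]])

# Identity kernels of size 3 and 5, used here only to make the result easy to check.
features = multiscale_local_features(
    tokens,
    small_kernels=[[0, 1, 0], [0, 1, 0]],
    large_kernels=[[0, 0, 1, 0, 0], [0, 0, 1, 0, 0]],
)
```

With a dilation of 2, the size-3 branch spans 5 neighbouring tokens and the size-5 branch spans 9, which is the sense in which different kernel sizes capture different spans of local interaction. In a trained model the kernel weights would be learned rather than fixed.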
