In recent years, convolutional neural networks and vision transformers have emerged as the predominant models for hyperspectral remote sensing image classification, leveraging stacked convolutional layers and self-attention mechanisms, respectively, at high computational cost. Recent studies, such as the Mamba model, have demonstrated that state space models (SSMs) with efficient hardware-aware designs can efficiently model sequences and extract implicit features along tokens, which is precisely what accurate hyperspectral image (HSI) classification requires. This makes SSM-based models a potential new architecture for remote sensing HSI classification. However, SSMs encounter challenges in modeling HSI due to their insensitivity to spatial information and the redundant spectral characteristics of HSI data. Given that SSM-based methods have rarely been explored for HSI classification, in this work we present the first exploration of SSM-based models for the HSI classification task. Our proposed method, MamTrans, effectively combines the capacity of transformers for capturing relationships among spatial tokens with the strength of Mamba for extracting implicit features along tokens. In addition, we propose a Bidirectional Mamba Module to enhance the SSM's ability to perceive and extract spatial features in HSI. MamTrans achieves new state-of-the-art performance on five commonly used HSI classification benchmarks, demonstrating its robust generalization and the effectiveness of SSM-based structures for HSI classification. Our code is available at https://github.com/PPPPPsanG/MamTrans.
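The abstract does not give the internal design of the Bidirectional Mamba Module, so the following is only a minimal PyTorch sketch of the general bidirectional-scan idea it alludes to: a causal sequence mixer (assumed to be a Mamba layer, e.g. from the mamba_ssm package) is applied to the flattened spatial tokens in both forward and reverse order, and the two outputs are fused so each token receives spatial context from both sides. All class and parameter names here are hypothetical.

```python
import torch
import torch.nn as nn


class GRUMixer(nn.Module):
    """Direction-sensitive stand-in for a Mamba layer so the sketch runs
    without the mamba_ssm package; with it installed one could instead pass
    lambda d: Mamba(d_model=d) as the mixer factory."""

    def __init__(self, d_model: int):
        super().__init__()
        self.gru = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.gru(x)
        return out


class BidirectionalSSMBlock(nn.Module):
    """Scan the token sequence in both directions and fuse the results,
    giving the (causal) sequence model context on both sides of each token."""

    def __init__(self, d_model: int, mixer_factory):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.fwd_mixer = mixer_factory(d_model)  # left-to-right scan
        self.bwd_mixer = mixer_factory(d_model)  # right-to-left scan
        self.proj = nn.Linear(2 * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_tokens, d_model), e.g. flattened HSI patch tokens
        residual = x
        x = self.norm(x)
        out_fwd = self.fwd_mixer(x)
        out_bwd = self.bwd_mixer(x.flip(dims=[1])).flip(dims=[1])
        fused = self.proj(torch.cat([out_fwd, out_bwd], dim=-1))
        return fused + residual


if __name__ == "__main__":
    block = BidirectionalSSMBlock(d_model=64, mixer_factory=GRUMixer)
    tokens = torch.randn(2, 81, 64)  # e.g. a 9x9 spatial patch -> 81 tokens
    print(block(tokens).shape)       # torch.Size([2, 81, 64])
```

In MamTrans such a block would sit alongside transformer layers, which handle the spatial token relationships; the sketch above only illustrates the bidirectional scanning component.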