Abstract

In recent computer vision tasks, the vision transformer (ViT) has demonstrated competitive performance. However, ViT still has drawbacks: the quadratic computational complexity of the self-attention layer makes inference expensive and slow, so processing every token of a high-resolution image is costly. Recently, the retentive network was proposed, offering strong performance, training parallelism, and inexpensive inference. For hyperspectral image (HSI) classification, this paper proposes a retention-based network model called the HSI retentive network (HSIRN). The proposed model keeps its memory usage independent of the token sequence length, facilitating efficient processing of high-resolution images with low inference and computational costs. Although the retention encoder can extract global information, it pays limited attention to local information; a convolutional neural network (CNN) is a powerful tool for extracting such local features. The proposed HSIRN model therefore uses a dedicated CNN-based block to extract local spectral-spatial information. To model the decay between successive vertical and horizontal positions along the depth dimension of the HSI, we propose a three-dimensional retention mechanism for the three-dimensional HSI data in the retention encoder. By efficiently exploiting both local and global spectral-spatial information, the proposed method offers a potent tool for HSI classification. We evaluated the classification performance of the proposed HSIRN approach on four datasets through comprehensive experiments, and the results demonstrate its superiority over state-of-the-art methods. The source code will be made publicly available at https://github.com/RajatArya22/HSIRN.
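For context, the retention mechanism admits a recurrent inference form in which a fixed-size state replaces the growing key-value cache of self-attention; this is what makes memory usage independent of the sequence length. Below is a minimal NumPy sketch of that standard one-dimensional recurrence from the retentive network literature, not the authors' HSIRN implementation or its three-dimensional variant; the function name, shapes, and decay value gamma are illustrative assumptions.

```python
import numpy as np

def recurrent_retention(Q, K, V, gamma=0.9):
    """Recurrent (inference-time) form of single-head retention (sketch).

    The state S has shape (d_k, d_v) regardless of how many tokens have
    been processed, so per-step memory is O(1) in sequence length.
    """
    n, d_k = Q.shape
    d_v = V.shape[1]
    S = np.zeros((d_k, d_v))           # fixed-size recurrent state
    outputs = np.empty((n, d_v))
    for t in range(n):
        # S_t = gamma * S_{t-1} + K_t^T V_t  (exponential decay + rank-1 update)
        S = gamma * S + np.outer(K[t], V[t])
        # o_t = Q_t S_t
        outputs[t] = Q[t] @ S
    return outputs
```

During training, an equivalent parallel form, Retention(X) = (QK^T ⊙ D)V with a decay matrix D, produces the same outputs while allowing full parallelism across tokens.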
