Abstract

Infrared image super-resolution (SR) methods overcome the hardware limitations of infrared cameras, reconstructing higher-quality images with improved efficiency and cost-effectiveness. However, existing infrared SR methods do not account for the specificity of infrared images and are primarily designed for small scale factors. In this paper, we first investigate the domain differences between infrared and visible images and their impact on the super-resolution task. We find that, compared with visible-image SR, infrared SR relies more on global edge-structure information than on the local texture information that previous CNN-based methods mainly reconstruct. To address this disparity, we propose a novel infrared SR model, named DASR, which incorporates a Transformer with spatial and channel dual-attention mechanisms. In DASR, spatial attention captures both local and global information through window-based and cross-window contextual long-range interactions, while channel attention captures channel-wise global information through cross-channel interactions. With this Transformer architecture, our method effectively extracts spatial and channel global information that cannot be captured by the local receptive field of convolution, making it better suited to infrared SR. Extensive experiments on benchmark datasets indicate that our method outperforms state-of-the-art infrared SR methods with fewer parameters and lower computational complexity, while producing the best visual results.
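The two attention branches described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it omits the learned query/key/value projections, multi-head splitting, and the cross-window (shifted-window) interaction, and the function names and window size are our own assumptions. It only shows the core distinction the abstract draws: spatial attention mixes tokens within local windows, while channel attention lets every channel attend to every other channel globally.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_attention(feat, win=4):
    """Spatial attention within non-overlapping windows.

    Each win*win window is treated as a set of tokens of dimension C;
    tokens attend only to other tokens in the same window (local mixing).
    feat: array of shape (H, W, C), with H and W divisible by win.
    """
    H, W, C = feat.shape
    out = np.empty_like(feat)
    for i in range(0, H, win):
        for j in range(0, W, win):
            x = feat[i:i + win, j:j + win].reshape(-1, C)  # (win*win, C)
            attn = softmax(x @ x.T / np.sqrt(C))           # token-token weights
            out[i:i + win, j:j + win] = (attn @ x).reshape(win, win, C)
    return out

def channel_attention(feat):
    """Cross-channel attention over the whole feature map.

    Each channel (flattened to length H*W) is a token, so the (C, C)
    affinity matrix captures channel-wise global interactions.
    """
    H, W, C = feat.shape
    x = feat.reshape(-1, C)                         # (H*W, C)
    attn = softmax(x.T @ x / np.sqrt(H * W))        # (C, C) channel affinity
    return (attn @ x.T).T.reshape(H, W, C)
```

In the actual model these branches would operate on learned projections of the features and be combined inside Transformer blocks; the sketch only contrasts the local spatial receptive field against the global channel-wise one.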
