Abstract

Infrared Small Target Detection (IRSTD) aims to detect small, dim targets against complex backgrounds. The low signal-to-noise ratio and low contrast of infrared imagery make these targets hard to extract, because background clutter can easily overwhelm them. Existing convolutional neural network (CNN)-based methods for IRSTD often lose information because they make inadequate use of the features obtained after each downsampling operation, which limits their ability to accurately extract the shape of infrared small targets. To address this challenge, we propose a Multi-scale U-shape Pyramid Transformer Network (MUPT-Net). The network uses the U-shape Interaction Module (UIM) and the Multi-scale ViT Module (MSVM) for feature extraction. By fully leveraging and integrating the information obtained after each downsampling operation, our approach enables precise extraction of shape information for infrared small targets. We also introduce the Axial Compression Attention (ACA) module, which captures the interplay of positional information within the feature map to support accurate detection of small targets. By iteratively fusing and enhancing multi-scale features, MUPT-Net effectively exploits the contextual information of small targets. Experimental results on the SIRST v1, SIRST v2, and NUDT-SIRST datasets show that our approach outperforms representative state-of-the-art (SOTA) IRSTD methods.
