Abstract

Infrared small target detection (IRSTD) is important for many practical applications such as hazardous aircraft warning, especially when the target cannot be seen in visible-light images because of atmospheric conditions such as fog and cloud. However, IRSTD is challenging because the targets are small and dim and the images are noisy. To address this challenge, we propose a novel Dim2Clear Network (Dim2Clear) for IRSTD. Specifically, Dim2Clear consists of a U-Net backbone encoder, a context mixer decoder (CMD) based on spatial and frequency attention (SFA), and an eyeball-shaped enhancement module (EEM). The CMD is composed of cascaded regular residual blocks into which two SFA modules are inserted. Each SFA module receives features from different residual blocks and generates a spatial attention map from them to modulate the low-level features, which are then decomposed into low- and high-frequency components using the discrete cosine transform. The features are then further modulated by the generated frequency attention maps. In this way, SFA extracts both spatial context and frequency context to improve the feature representation capacity. In addition, we design the EEM to suppress noise and improve the signal-to-noise ratio of the segmentation results from the perspective of image super-resolution. Experiments on the SIRST dataset and our newly constructed IRSTD-1k dataset show that the proposed Dim2Clear outperforms state-of-the-art methods.
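
As a rough illustration of the frequency step described above, the following minimal sketch splits a single 2-D feature map into low- and high-frequency parts with an orthonormal DCT and then reweights them. It is not the authors' SFA implementation: the function names, the diagonal cutoff rule, and the scalar weights `w_low` and `w_high` (standing in for learned frequency attention maps) are all assumptions made for the example.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the n x n orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)  # rescale the DC row so the matrix is orthonormal
    return mat


def split_frequencies(feature: np.ndarray, cutoff: int):
    """Split a 2-D map into low- and high-frequency parts via the 2-D DCT.

    Coefficients whose index sum u + v is below `cutoff` count as "low"
    (a hypothetical rule chosen for this sketch); the rest count as "high".
    """
    h, w = feature.shape
    ch, cw = dct_matrix(h), dct_matrix(w)
    coeffs = ch @ feature @ cw.T  # forward 2-D DCT
    u = np.arange(h)[:, None]
    v = np.arange(w)[None, :]
    low_mask = (u + v) < cutoff
    low = ch.T @ (coeffs * low_mask) @ cw    # inverse DCT of the low band
    high = ch.T @ (coeffs * ~low_mask) @ cw  # inverse DCT of the high band
    return low, high


def modulate(feature: np.ndarray, cutoff: int, w_low: float, w_high: float) -> np.ndarray:
    """Reweight the two bands; the scalars stand in for learned attention maps."""
    low, high = split_frequencies(feature, cutoff)
    return w_low * low + w_high * high
```

Because the two masks are complementary and the transform is orthonormal, the two bands sum back to the input, so `modulate(x, cutoff, 1.0, 1.0)` returns `x` up to floating-point error. Weights other than 1 emphasize or suppress one band.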
