Abstract

Most red–green–blue and thermal (RGB-T) salient object detection methods require high memory consumption and incur large computational costs, which limits their applicability. To alleviate these resource requirements, we propose a lightweight multimodality enhancement network (MENet) for RGB-T salient object detection with relatively few parameters. Because RGB and thermal images come from different domains, the modality gap leads to unsatisfactory results when features are simply concatenated. Instead, we introduce a multimodality complementary enhancement module with a nested residual structure to fuse RGB and thermal features. During decoding, we design a recursive sharpening module that is inspired by atrous spatial pyramid pooling and dense connections and has fewer parameters than comparable methods. By reducing the number of parameters and computations of each component, MENet has only 21.26 M parameters and requires just 2.501 G floating-point operations (FLOPs). Experiments were performed on three benchmark datasets. The results show that the proposed MENet consistently outperforms 16 state-of-the-art methods.
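To illustrate the idea of a nested residual structure for cross-modal fusion, the toy sketch below fuses two feature maps with an inner and an outer residual connection. This is a minimal illustration only: the function names, weights, and the `np.tanh` stand-in for a learned transform are assumptions, not the actual MENet module.

```python
import numpy as np

def transform(x, w):
    # Stand-in for a learned convolutional transform (assumption).
    return np.tanh(w * x)

def nested_residual_fuse(rgb, thermal, w1=0.5, w2=0.5):
    """Toy nested-residual fusion of two modality feature maps.

    Inner residual: the thermal features are enhanced with a transform
    of the RGB features. Outer residual: the RGB features are enhanced
    with a transform of that inner result, so the fused output keeps
    an identity path to the RGB stream at both nesting levels.
    """
    inner = thermal + transform(rgb, w1)   # inner residual
    outer = rgb + transform(inner, w2)     # outer residual
    return outer

rgb = np.ones((4, 4))       # toy RGB feature map
thermal = np.zeros((4, 4))  # toy thermal feature map
fused = nested_residual_fuse(rgb, thermal)
print(fused.shape)  # (4, 4): fusion preserves the spatial resolution
```

The nesting keeps a direct identity path from the RGB stream to the output while still injecting thermal information, which is one common way residual designs mitigate a modality gap without adding many parameters.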