Abstract

Camouflaged Object Detection (COD) is a critical task in domains such as medicine and military applications. The main challenge in COD is to accurately detect and extract a concealed object from a complex background, because the similarity between camouflaged objects and their surroundings significantly reduces extraction accuracy. Polarization information offers valuable insight into objects that differ in material properties and surface roughness. Differences in polarization between an object and its background increase the contrast between the two and improve detection accuracy, even in complex scenes. In this paper, we propose IPNet, an efficient cross-modal fusion network that uses both RGB intensity and linear polarization cues to generate a high-contrast scene representation. The network dynamically fuses RGB intensity and polarization cues through an efficient cross-modal fusion module and leverages cross-level contextual information to achieve robust detection. To train and evaluate the proposed network, we construct PCOD_1200, a polarization-based dataset that contains 1200 samples across 89 subclasses. Comprehensive experiments demonstrate the effectiveness of IPNet in fusing polarization and RGB intensity information and show that our approach outperforms state-of-the-art methods.
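To illustrate why linear polarization cues can separate an object from a background of similar brightness, the following minimal sketch computes the standard linear Stokes parameters, the degree of linear polarization (DoLP), and the angle of linear polarization (AoLP) for a single pixel. It assumes four intensity measurements taken behind polarizers at 0, 45, 90, and 135 degrees, which is a common acquisition setup; the abstract does not state how IPNet derives its polarization cues, so the function name and inputs here are hypothetical and not the authors' pipeline.

```python
import math


def linear_polarization_cues(i0, i45, i90, i135):
    """Return (s0, dolp, aolp) for one pixel.

    Assumption: i0, i45, i90, i135 are intensities measured behind linear
    polarizers at 0, 45, 90 and 135 degrees. This is the textbook Stokes
    formulation, not necessarily the preprocessing used by IPNet.
    """
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical component
    s2 = i45 - i135                     # +45 vs. -45 degree component
    if s0 <= 0.0:
        return 0.0, 0.0, 0.0            # dark pixel: no usable polarization cue
    dolp = min(1.0, math.hypot(s1, s2) / s0)  # clamp against measurement noise
    aolp = 0.5 * math.atan2(s2, s1)           # radians, in (-pi/2, pi/2]
    return s0, dolp, aolp


# Two pixels with identical total intensity but different polarization states.
object_pixel = linear_polarization_cues(1.0, 0.5, 0.0, 0.5)      # strongly polarized
background_pixel = linear_polarization_cues(0.5, 0.5, 0.5, 0.5)  # unpolarized
```

In this toy case both pixels share the same total intensity `s0`, so an intensity-only view cannot tell them apart, while their DoLP values differ. This is the kind of added contrast the abstract attributes to polarization information.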

