Abstract

• We propose an end-to-end network for salient object detection that combines asynchronous event data with RGB images.
• We use a Long Short-Term Memory module and multi-level feature interaction modules so that the image and event data complement each other.
• We build an Event-RGB SOD dataset (ERSOD) and make it available to the research community.

Salient object detection (SOD) aims to mimic the attention mechanism of the human visual system. Because traditional image data provides limited information, image-based SOD remains challenging in complex scenes. Inspired by emerging event cameras, which provide asynchronous measurements of local temporal contrast over a large dynamic range, we propose to extract more effective information by combining the event stream with RGB images. In this paper, we construct an end-to-end joint network for salient object detection (ERSOD-Net), which jointly processes the RGB image and the event data captured within the corresponding image exposure time. To fully exploit the temporal information of the event data, a Long Short-Term Memory module is used to process events effectively and learn salient-object event surfaces. Moreover, a multi-level feature interaction module is designed to fuse the two complementary branches, image and event, and to predict the saliency map. Finally, to demonstrate the effectiveness of our model, we build a real Event-RGB SOD dataset (ERSOD) with a DAVIS camera. Experiments on both benchmark datasets and ERSOD show that the proposed event-guided network substantially improves SOD performance across different evaluation metrics. The code and datasets will be released at https://github.com/jxr326/ERSOD-Net.
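The pipeline described above can be sketched at a toy scale: events recorded during one exposure window are binned into a temporal voxel grid, an LSTM runs over the temporal bins of the event branch, and the resulting event feature is fused with an RGB feature. This is only a minimal illustration of the general idea, not the authors' implementation; the voxel-grid binning, the untrained NumPy LSTM cell, the concatenation fusion, and all names and sizes here are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate (t, x, y, polarity) events into temporal bins (a common
    event representation; not necessarily the paper's exact encoding)."""
    grid = np.zeros((num_bins, height, width), dtype=np.float32)
    t = events[:, 0]
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9)
    bin_idx = np.minimum((t_norm * num_bins).astype(int), num_bins - 1)
    for b, ev in zip(bin_idx, events):
        _, x, y, p = ev
        grid[b, int(y), int(x)] += 1.0 if p > 0 else -1.0
    return grid

class LSTMCell:
    """Minimal LSTM cell with random (untrained) weights, for shape checking."""
    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(hidden_size)
        self.W = rng.uniform(-s, s, (4 * hidden_size, input_size + hidden_size))
        self.b = np.zeros(4 * hidden_size)
        self.H = hidden_size

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        H = self.H
        i = sigmoid(z[:H])          # input gate
        f = sigmoid(z[H:2 * H])     # forget gate
        g = np.tanh(z[2 * H:3 * H]) # candidate cell state
        o = sigmoid(z[3 * H:])      # output gate
        c = f * c + i * g
        h = o * np.tanh(c)
        return h, c

# Toy exposure window: random events on an 8x8 sensor.
rng = np.random.default_rng(1)
events = np.column_stack([
    np.sort(rng.uniform(0, 1, 200)),  # timestamps within the exposure
    rng.integers(0, 8, 200),          # x coordinate
    rng.integers(0, 8, 200),          # y coordinate
    rng.choice([-1, 1], 200),         # polarity
]).astype(np.float32)

num_bins, H, W = 5, 8, 8
voxels = events_to_voxel_grid(events, num_bins, H, W)

# Run the LSTM over the temporal bins of the event branch.
cell = LSTMCell(input_size=H * W, hidden_size=16)
h, c = np.zeros(16), np.zeros(16)
for b in range(num_bins):
    h, c = cell.step(voxels[b].ravel(), h, c)

# Fuse with a stand-in RGB feature vector by concatenation; a real model
# would fuse at multiple levels and decode this into a saliency map.
rgb_feat = rng.standard_normal(16)
fused = np.concatenate([h, rgb_feat])
print(fused.shape)  # (32,)
```

In the actual network the fusion happens at multiple feature levels and the fused features are decoded into a dense saliency map; here a single concatenation stands in for that interaction.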
