Abstract

Visual perception plays an important role in the industrial information field, especially in robotic grasping applications. To detect the object to be grasped quickly and accurately, salient object detection (SOD) is applied to this task. Although existing SOD methods have achieved impressive performance, they remain limited in the complex, interference-prone environments encountered in practical applications. To better handle such environments, a novel triple-modal image fusion strategy is proposed to implement SOD for robotic visual perception, namely visible-depth-thermal (VDT) SOD. We also build an image acquisition system for variable-lighting scenes and construct a novel benchmark dataset for VDT SOD (the VDT-2048 dataset). The multiple image modalities assist one another in highlighting salient regions, but they inevitably introduce interference as well. To achieve effective cross-modal feature fusion while suppressing this interference, a hierarchical weighted suppress interference (HWSI) method is proposed. Comprehensive experimental results show that our method outperforms state-of-the-art methods.

