Abstract

The task of salient object detection is to find the most visually noticeable regions in an image. On the one hand, most existing RGB-D salient object detection methods require an additional sub-network to process depth features, and this sub-network depends heavily on the RGB network, resulting in high computational cost. On the other hand, when handling multi-scale features, most models suffer from information loss and weak semantic representation, so they cannot achieve good detection results, which limits their practical application. Firstly, this paper proposes an aggregation-and-interaction strategy to extract edge features, depth features and saliency features, preserving local details while fully capturing global information. Secondly, during the learning of high-level features, depth features and saliency features are extracted simultaneously, which reduces network complexity and removes the need for an additional sub-network. Thirdly, deformable convolution is used to address the multi-scale problem and ensure that more detailed feature information is extracted. Finally, exploiting the complementarity between features, a one-to-one feature fusion module resolves the information redundancy that arises during feature fusion, so that the fused features can accurately locate salient objects with clear details. Experimental results on six datasets show that, compared with other state-of-the-art algorithms, the proposed algorithm achieves excellent performance.
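The abstract's exact network design is not specified here, but the core idea behind deformable convolution, which it invokes for the multi-scale problem, can be illustrated in isolation: each kernel tap samples the input at a regular grid position plus a learned fractional offset, using bilinear interpolation. The following is a minimal single-channel NumPy sketch of that sampling scheme (the function names, offset layout, and border-clamping policy are illustrative assumptions, not the paper's implementation); real models would use an optimized operator such as `torchvision.ops.DeformConv2d`.

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Sample a 2-D array at fractional coordinates (y, x), clamping to the border."""
    h, w = img.shape
    y = np.clip(y, 0, h - 1)
    x = np.clip(x, 0, w - 1)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * img[y0, x0] + (1 - wy) * wx * img[y0, x1]
            + wy * (1 - wx) * img[y1, x0] + wy * wx * img[y1, x1])

def deform_conv2d(img, weight, offsets):
    """Single-channel deformable convolution, stride 1, border-clamped.

    img:     (H, W) input map
    weight:  (kh, kw) kernel
    offsets: (kh*kw, 2, H, W) learned per-tap (dy, dx) offsets; in a real
             network these are predicted by a companion convolution layer.
    """
    h, w = img.shape
    kh, kw = weight.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for k, (dy, dx) in enumerate(np.ndindex(kh, kw)):
                # base tap position of a centered kernel, shifted by the offset
                py = i + dy - kh // 2 + offsets[k, 0, i, j]
                px = j + dx - kw // 2 + offsets[k, 1, i, j]
                acc += weight[dy, dx] * bilinear_sample(img, py, px)
            out[i, j] = acc
    return out
```

With all offsets at zero this reduces to an ordinary convolution; non-zero offsets let the effective receptive field deform to match object scale and shape, which is the property the abstract relies on for multi-scale feature extraction.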
