Abstract

Salient object detection is a fundamental problem in image processing and computer vision. Many saliency detection algorithms rely on background or frequency-domain cues to extract salient objects. However, background-based methods tend to submerge the real object within the detected object areas, especially in complex scenes or scenes with small objects, while frequency-domain methods lose some object information when detecting large objects. To solve these problems and achieve better detection results, we propose a framework for salient object detection that fuses background and frequency-domain features. The background features of the image are extracted by an improved background model, which represents the spatial layout of each image region with respect to the image boundaries. Meanwhile, we present a new frequency-domain processing method that obtains multiscale frequency-domain features and marks the saliency of the object at different scales. Within our framework, inspired by human visual attention, we use the idea of a self-attention mechanism to capture the intrinsic relation between the background features and the multiscale frequency-domain features. In addition, the fusion framework provides a three-dimensional Gaussian convolution kernel, which expands two-dimensional local information to three dimensions for feature fusion and thus produces more accurate salient objects. Experimental results demonstrate that the proposed method consistently outperforms eleven state-of-the-art methods on five challenging and complicated datasets in terms of four evaluation metrics.
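
The abstract does not say how the three-dimensional Gaussian kernel is applied, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that the background map and the multiscale frequency-domain maps are stacked along a third axis, smoothed with a normalized 3-D Gaussian kernel so that each output value mixes spatial neighbours and neighbouring feature maps, and then averaged into a single saliency map. The function names `gaussian_kernel_3d` and `fuse_feature_maps`, the kernel size, the value of sigma, the edge padding, and the mean reduction are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel_3d(size=3, sigma=1.0):
    """Return a normalized size x size x size Gaussian kernel (size must be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    zz, yy, xx = np.meshgrid(ax, ax, ax, indexing="ij")
    kernel = np.exp(-(xx**2 + yy**2 + zz**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()


def fuse_feature_maps(maps, size=3, sigma=1.0):
    """Fuse a stack of 2-D maps with shape (K, H, W) into one (H, W) map.

    Hypothetical fusion rule: 3-D Gaussian smoothing across the stack,
    followed by a mean over the K maps and min-max normalization.
    """
    stack = np.asarray(maps, dtype=np.float64)
    kernel = gaussian_kernel_3d(size, sigma)
    pad = size // 2
    # Replicate border values on all three axes so the output keeps its shape.
    padded = np.pad(stack, pad, mode="edge")
    k, h, w = stack.shape
    smoothed = np.zeros_like(stack)
    # Direct 3-D correlation; identical to convolution for a symmetric kernel.
    for dz in range(size):
        for dy in range(size):
            for dx in range(size):
                smoothed += kernel[dz, dy, dx] * padded[dz:dz + k, dy:dy + h, dx:dx + w]
    fused = smoothed.mean(axis=0)
    lo, hi = fused.min(), fused.max()
    # Rescale to [0, 1]; a constant map is returned as all zeros.
    return (fused - lo) / (hi - lo) if hi > lo else np.zeros_like(fused)
```

In this sketch, a call such as `fuse_feature_maps(np.stack([background_map, freq_map_s1, freq_map_s2]))` would return one map in the range [0, 1]; the three input names are placeholders for equally sized 2-D arrays.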
