Saliency detection aims to identify the most visually informative regions of an image. Conventional methods rely on low-level, biologically inspired cues such as color and texture, but they struggle with complex or low-contrast images. In this paper, we introduce a saliency detection method based on deep neural networks. First, using semantic segmentation, we construct a pixel-level model that assigns each pixel a saliency value according to its semantic category. Second, we build a region feature model that extracts and fuses the local and global information of each superpixel region by combining hand-crafted and deep features. Third, we combine the outputs of the previous two steps with the over-segmented superpixel images and the original images to construct a multi-level feature model. This model is fed into a deep convolutional network, which produces the final saliency map by learning to integrate pixel-level and superpixel-level information at both macro and micro scales. We evaluate our method on five benchmark datasets and compare it with 14 state-of-the-art saliency detection algorithms. The experimental results show that our method outperforms the compared methods in F-measure, precision, recall, and runtime. We also analyze the limitations of our method and discuss potential future developments.
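As a rough illustration of the second step (the region feature model), the following is a minimal sketch of per-superpixel feature fusion, assuming SLIC over-segmentation and a zero-filled placeholder where pooled CNN activations would go. The function name `region_features`, the choice of mean Lab color and centroid as hand-crafted cues, and the 128-dimensional deep-feature placeholder are illustrative assumptions, not the paper's actual specification.

```python
# Hypothetical sketch of superpixel-level feature fusion (not the paper's
# exact pipeline): over-segment the image, then concatenate a simple
# hand-crafted descriptor with a placeholder "deep" descriptor per region.
import numpy as np
from skimage import color
from skimage.segmentation import slic


def region_features(image_rgb, n_segments=200):
    """Return the superpixel label map and one fused feature vector per region."""
    labels = slic(image_rgb, n_segments=n_segments, compactness=10)
    lab = color.rgb2lab(image_rgb)
    h, w = labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    feats = []
    for r in np.unique(labels):
        mask = labels == r
        # Hand-crafted local cues: mean Lab color and normalized centroid.
        mean_lab = lab[mask].mean(axis=0)
        centroid = np.array([ys[mask].mean() / h, xs[mask].mean() / w])
        # Placeholder for a deep descriptor; a real implementation would
        # pool backbone CNN activations over the region instead (assumption).
        deep = np.zeros(128)
        feats.append(np.concatenate([mean_lab, centroid, deep]))
    return labels, np.stack(feats)
```

Fusing features at the region level rather than per pixel keeps the downstream network input compact while still letting each region carry both local appearance cues and global context.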