Abstract

Object-level saliency detection is an active research field that supports many content-based computer vision and remote-sensing tasks. This paper introduces an efficient unsupervised approach to salient object detection based on recursive sparse representation. Rather than the common local and global contrasts, the reconstruction error under foreground and background dictionaries serves as the saliency indicator, which effectively improves the integrity of the detected objects. The proposed method consists of four steps: (1) regional feature extraction; (2) extraction of background and foreground dictionaries according to the initial saliency map and image boundary constraints; (3) sparse representation and saliency measurement; and (4) recursive processing, in which the current saliency map replaces the initial saliency map in step 2 and step 3 is repeated. The proposed method is compared experimentally with seven state-of-the-art saliency detection methods on three benchmark datasets, as well as on some satellite and unmanned aerial vehicle remote-sensing images. The results confirm that the proposed method is more effective than the compared methods and achieves more favorable performance in detecting multiple objects while maintaining the integrity of the object area.

Highlights

  • Visual saliency, which is an important and fundamental research problem in remote-sensing image interpretation, computer vision, psychology and neuroscience, is concerned with the distinct perceptual quality of a biological system that makes certain regions of a scene stand out from their neighbors and thereby helps humans to quickly and accurately focus on the most visually noticeable foreground in a scene [1,2,3,4]

  • The related methods are categorized as either eye-fixation prediction methods or salient object detection methods [18], depending on the specific needs and objectives

  • The aim of the eye-fixation prediction methods is to generate a pixel-wise saliency prediction map that is based on a biological model for human eye-fixation activity; the salient object detection methods aim to create a region-level saliency map for the purpose of object appearance preservation [19]


Summary

Introduction

Visual saliency, which is an important and fundamental research problem in remote-sensing image interpretation, computer vision, psychology and neuroscience, is concerned with the distinct perceptual quality of a biological system that makes certain regions of a scene stand out from their neighbors and thereby helps humans to quickly and accurately focus on the most visually noticeable foreground in a scene [1,2,3,4]. Although supervised learning-based object detection methods attract great attention and obtain outstanding results thanks to the rapid development of deep-learning technology [20,21,22], unsupervised methods still have considerable research value because of their autonomy and adaptability. The proposed method combines background and foreground dictionaries and completes the detection task by sparse representation, in order to alleviate the disadvantages of existing methods. This study uses two separate sparse representation steps: one background-based and one foreground-based. The eye-fixation result restricts the extraction of background regions, so that the method adapts to cases in which salient objects touch the image boundaries, and it also serves as the basis for building the foreground dictionary.
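The following is a minimal sketch of how reconstruction error under background and foreground dictionaries could be turned into a recursive saliency estimate, assuming each region is already described by a feature vector. It is not the authors' implementation: the function names, the mean-value threshold, the greedy orthogonal matching pursuit solver, and the rule that combines the two errors are all illustrative assumptions, and the initial saliency vector merely stands in for the eye-fixation result.

```python
import numpy as np

def omp(D, x, k):
    """Greedy orthogonal matching pursuit: code x with at most k columns of D."""
    coef = np.zeros(D.shape[1])
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0            # avoid division by zero for empty atoms
    residual, support, sol = x.copy(), [], None
    for _ in range(min(k, D.shape[1])):
        # choose the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual) / norms))
        if j in support:               # no new atom helps, stop early
            break
        support.append(j)
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

def reconstruction_error(D, X, k):
    """Squared reconstruction error of each column of X under dictionary D."""
    return np.array([np.sum((x - D @ omp(D, x, k)) ** 2) for x in X.T])

def normalize(v):
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.zeros_like(v)

def recursive_saliency(X, boundary_mask, init_saliency, n_iter=3, k=2):
    """X: (d, n) region features; boundary_mask: (n,) bool; init_saliency: (n,)."""
    s = init_saliency.astype(float)
    for _ in range(n_iter):
        thr = s.mean()                        # assumed threshold, not from the paper
        bg_idx = boundary_mask & (s <= thr)   # boundary regions that look non-salient
        fg_idx = s > thr                      # regions that currently look salient
        if not bg_idx.any() or not fg_idx.any():
            break
        e_bg = normalize(reconstruction_error(X[:, bg_idx], X, k))
        e_fg = normalize(reconstruction_error(X[:, fg_idx], X, k))
        # assumed combination: salient = poorly explained by the background
        # dictionary and well explained by the foreground dictionary
        s = normalize(e_bg * (1.0 - e_fg))
    return s

# Synthetic example: 15 background-like regions, then 5 object-like regions.
rng = np.random.default_rng(0)
d, n = 8, 20
a, b = np.eye(d)[0], np.eye(d)[1]             # two orthogonal prototype features
X = np.column_stack([a] * 15 + [b] * 5) + 0.01 * rng.standard_normal((d, n))
boundary_mask = np.arange(n) < 8              # first 8 regions touch the border
init_saliency = np.where(np.arange(n) >= 15, 0.8, 0.2)
s = recursive_saliency(X, boundary_mask, init_saliency)
```

In this toy setup the object-like regions end up with higher saliency than the background-like ones, because the background dictionary cannot reconstruct them while the foreground dictionary can. Real regional features, a different sparse solver, or a different fusion rule would change the numbers.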

