Abstract
An important aspect of visual saliency detection is how the features that form an input image are represented. One popular theory supports sparse feature representation, in which an image is represented with a basis dictionary and sparse weighting coefficients. Another method uses a nonlinear combination of image features. In our work, we combine the two and propose a scheme that takes advantage of both sparse and nonlinear feature representation, using independent component analysis (ICA) and covariance matrices, respectively. To compute saliency, we use a biologically plausible center-surround difference (CSD) mechanism. Our sparse features are adaptive in nature: the ICA basis functions are learned for every image, rather than being fixed. We show that adaptive sparse features, when used with a CSD mechanism, yield better results than fixed sparse representations. We also show that covariance matrices formed from a nonlinear integration of color information alone are sufficient to efficiently estimate saliency from an image. The proposed dual representation scheme is then evaluated on human eye fixation prediction, response to psychological patterns, and salient object detection on well-known datasets. We conclude that the two forms of representation complement one another and result in better saliency detection.
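The nonlinear color representation described above can be illustrated with a small sketch: each pixel in a region is treated as a color feature vector, the region is summarized by the covariance of those vectors, and saliency is driven by the dissimilarity between center and surround descriptors. This is a minimal illustration, not the paper's implementation; the log-Euclidean distance used here is an assumption, and the regularization constant `eps` is a hypothetical parameter.

```python
import numpy as np

def color_covariance(patch):
    """Region covariance descriptor over color channels.

    patch: H x W x 3 array (e.g. RGB or Lab). Each pixel is a 3-D color
    feature vector; the 3x3 covariance of those vectors captures a
    nonlinear (second-order) integration of the color channels.
    """
    feats = patch.reshape(-1, patch.shape[-1]).astype(float)  # N x 3
    return np.cov(feats, rowvar=False)                        # 3 x 3

def _logm_spd(m):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    w, v = np.linalg.eigh(m)
    return (v * np.log(w)) @ v.T

def center_surround_distance(cov_c, cov_s, eps=1e-6):
    """Dissimilarity between center and surround covariance descriptors.

    Uses the log-Euclidean metric as one common choice for comparing
    covariance matrices (assumption: other metrics, such as the
    generalized-eigenvalue Riemannian distance, are also used in the
    region-covariance literature).
    """
    d = cov_c.shape[0]
    # Regularize so both matrices are strictly positive definite.
    log_c = _logm_spd(cov_c + eps * np.eye(d))
    log_s = _logm_spd(cov_s + eps * np.eye(d))
    return np.linalg.norm(log_c - log_s, ord="fro")
```

A large center-surround distance marks the center region as salient; identical center and surround statistics yield a distance of zero.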
Highlights
Vision is the primary source of information that the human brain uses to understand the environment it operates in
Our approach is a modification of the model proposed in [17], where all features and channels are nonlinearly integrated using covariance matrices; we propose that color information alone is sufficient and can better estimate saliency in our framework
We found that the standard deviation of the sAUC ranges approximately from 1e−4 to 5e−4 in our experiments
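The sAUC metric reported in the highlight above scores a saliency map by how well it separates fixated locations on the test image from fixations pooled from other images, which penalizes center bias. The following is a simplified sketch under that definition; published implementations differ in negative sampling and tie handling, so this is not the paper's exact evaluation code.

```python
import numpy as np

def shuffled_auc(sal_map, fix_points, other_fix_points):
    """Shuffled AUC (sAUC) for eye-fixation prediction.

    sal_map: 2-D saliency map.
    fix_points: (row, col) fixations on this image (positives).
    other_fix_points: fixations taken from other images (negatives).
    Returns P(pos > neg) + 0.5 * P(pos == neg), i.e. the ROC AUC
    computed exhaustively over all positive/negative pairs.
    """
    pos = np.array([sal_map[r, c] for r, c in fix_points], dtype=float)
    neg = np.array([sal_map[r, c] for r, c in other_fix_points], dtype=float)
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties
```

A score of 1.0 means every fixated location outranks every shuffled negative; 0.5 is chance level.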
Summary
Vision is the primary source of information that the human brain uses to understand the environment it operates in. In order to efficiently process such a huge amount of information, the brain uses visual attention to seek out only the most salient regions in the visual field. A number of computational algorithms have been designed based on visual attention in primates. Such visual saliency models have shown reasonable performance and are used in many applications, such as robot localization [1], salient object detection [2], object tracking [3], video compression [4], thumbnail generation [5], and so forth. A detailed discussion on the subject can be found in [6].