Modeling the visual saliency map of an image provides important information for image semantic understanding in many applications. Most existing computational visual saliency models follow a bottom-up framework that generates an independent saliency map in each selected visual feature space and combines the maps in a suitable way. Two major challenges that such methods must address explicitly are (1) which features should be extracted for all pixels of the input image and (2) how to dynamically determine the importance of the saliency map generated in each feature space. To address these problems, we present a novel saliency map computational model based on tensor decomposition and reconstruction. The tensor representation and analysis not only explicitly encode the image's color values but also capture two important relationships inherent in a color image: the spatial correlations between pixels and the interplay between color channels. As a result, a saliency map generator based on the proposed model can adaptively find the most suitable features, and their combination coefficients, for each pixel. Experiments on a synthetic image set and a real image set show that our method is superior or comparable to other prevailing saliency map models.
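To make the core idea concrete, the following is a minimal sketch of one plausible instantiation: an H x W x 3 color image treated as a third-order tensor, decomposed by truncated HOSVD (a Tucker-style decomposition), with the per-pixel reconstruction error taken as the saliency estimate. The ranks, the HOSVD variant, and the error-based scoring are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

def unfold(T, mode):
    """Matricize tensor T along the given mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    """Inverse of unfold: rebuild a tensor of `shape` from its mode-`mode` matricization."""
    full_shape = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(full_shape), 0, mode)

def mode_product(T, M, mode):
    """n-mode product of tensor T with matrix M along `mode`."""
    new_shape = tuple(M.shape[0] if i == mode else s for i, s in enumerate(T.shape))
    return fold(M @ unfold(T, mode), mode, new_shape)

def saliency_by_tensor_reconstruction(image, ranks=(32, 32, 2)):
    """Truncated HOSVD of an H x W x 3 image tensor; the per-pixel
    reconstruction error serves as the saliency estimate (an assumed
    scoring rule, chosen here for illustration)."""
    T = image.astype(np.float64)
    # Leading left singular vectors of each mode unfolding: the row-mode and
    # column-mode factors capture spatial correlations between pixels, while
    # the channel-mode factor captures the interplay between color channels.
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    # Core tensor: project the image onto the truncated factor bases.
    core = T
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)
    # Low-rank reconstruction captures the "ordinary" image structure;
    # pixels that are poorly reconstructed are treated as salient.
    recon = core
    for mode, U in enumerate(factors):
        recon = mode_product(recon, U, mode)
    err = np.sqrt(((T - recon) ** 2).sum(axis=2))
    return (err - err.min()) / (err.max() - err.min() + 1e-12)

# Usage: saliency = saliency_by_tensor_reconstruction(img)  # img: H x W x 3 array
```

Because the factor matrices are estimated from the image itself, the features and their weighting adapt per image rather than being fixed in advance, which is the adaptivity the abstract describes.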