Abstract

In this work, we propose touch saliency as an alternative ground truth to the eye fixation map in visual attention studies. Touch saliency data is easy to obtain because it can be collected directly from recordings of users' daily browsing behavior on widely used touch-screen smartphones. Because of the limited screen size, smartphone users usually move and zoom into images while browsing, holding the region of interest on the screen. Our study is two-fold. First, we collect these touch-screen fixation maps (touch saliency) and study their characteristics through comprehensive comparisons with their counterpart, eye fixation maps (visual saliency). The comparisons show that touch saliency is highly correlated with eye fixations for the same stimuli, which indicates its utility for data collection in visual attention studies. Second, building on the consistency between touch saliency and visual saliency, we propose a unified saliency prediction model for both visual and touch saliency detection. This model uses middle-level object category features extracted from pre-segmented image superpixels as input to the recently proposed multitask sparsity pursuit (MTSP) framework for saliency prediction. Extensive evaluations show that the proposed middle-level category features considerably improve saliency prediction performance with both touch saliency and visual saliency as ground truth.
