Abstract
Saliency maps produced by different algorithms are often evaluated by comparing output to fixated image locations appearing in human eye tracking data. There are challenges in evaluation based on fixation data due to bias in the data. Properties of eye movement patterns that are independent of image content may limit the validity of evaluation results, including spatial bias in fixation data. To address this problem, we present modeling and evaluation results for data derived from different perceptual tasks related to the concept of saliency. We also present a novel approach to benchmarking to deal with some of the challenges posed by spatial bias. The results presented establish the value of alternatives to fixation data to drive improvement and development of models. We also demonstrate an approach to approximate the output of alternative perceptual tasks based on computational saliency and/or eye gaze data. As a whole, this work presents novel benchmarking results and methods, establishes a new performance baseline for perceptual tasks that provide an alternative window into visual saliency, and demonstrates the capacity for saliency to serve in approximating human behaviour for one visual task given data from another.
Highlights
For many saliency algorithms, the goal is to approximate fixation locations in eye tracking data derived from many human observers
We believe that improving the prediction of explicit judgments is likely more prudent than improving performance on traditional fixation tasks, for several reasons: explicit judgment captures the most salient locations within a scene through a selection process that is less clouded by noise from spatial bias and fixation mechanisms, and it relates more closely to the content relevant to the role of saliency in applications in computer vision and multimedia
While some sense of this is already provided in our benchmarking results, we further examine the strength of predictions that may be achieved through an ensemble approach that relies on existing saliency algorithms
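As an illustration of the ensemble idea mentioned in the highlights, the sketch below combines the outputs of several saliency algorithms into one map by rescaling each map to [0, 1] and taking a weighted average. This is a minimal sketch under stated assumptions, not the authors' method: the function names `normalize_map` and `ensemble_saliency`, the min-max normalization, and the averaging rule are all hypothetical choices made for illustration.

```python
import numpy as np


def normalize_map(saliency):
    """Rescale a saliency map to [0, 1] (assumed min-max normalization)."""
    s = np.asarray(saliency, dtype=float)
    span = s.max() - s.min()
    if span == 0:
        # A constant map carries no ranking information; return all zeros.
        return np.zeros_like(s)
    return (s - s.min()) / span


def ensemble_saliency(maps, weights=None):
    """Weighted average of normalized saliency maps of identical shape.

    `maps` is a sequence of 2D arrays, one per algorithm (hypothetical input).
    `weights` defaults to equal weighting and is rescaled to sum to 1.
    """
    stacked = np.stack([normalize_map(m) for m in maps])
    if weights is None:
        weights = np.ones(len(maps))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    # Contract the algorithm axis: (k,) x (k, h, w) -> (h, w).
    return np.tensordot(weights, stacked, axes=1)
```

Equal weighting is the simplest choice; weights fitted against held-out human data would be a natural alternative, but that is outside what this sketch shows.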
Summary
For many saliency algorithms, the goal is to approximate fixation locations in eye tracking data derived from many human observers. Many observers are required to derive a suitably sized pool of data, and due to individual variation a variety of locations within a scene will typically be selected as most salient across observers. Saliency maps produced by predictive algorithms might instead carry the goal of approximating locations selected via explicit judgment, rather than fixations in eye tracking data, for the reasons mentioned. We propose a method for performance evaluation that corrects for data bias in a manner that is distinct from prior work. This is shown to produce greater consistency in evaluation results, and it controls for both spatial bias in fixation data and the non-uniform importance weighting inherent in existing evaluation methods.
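To make the spatial bias problem concrete, the sketch below shows one commonly used way of controlling for it: a shuffled-AUC style score, in which saliency values at locations fixated in the current image are compared against saliency values at locations fixated in other images. This is not the evaluation method proposed in this work, which is described as distinct from prior approaches; it is only a minimal illustration of the kind of correction at issue. The function names and the (row, column) fixation format are assumptions.

```python
import numpy as np


def auc_from_scores(pos, neg):
    """Area under the ROC curve computed from pairwise comparisons.

    Equals the probability that a random positive score exceeds a random
    negative score, with ties counted as one half.
    """
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


def shuffled_auc(saliency, fixations, other_fixations):
    """Shuffled-AUC style score for one image.

    `fixations` are (row, col) locations fixated in this image (positives).
    `other_fixations` are (row, col) locations fixated in other images
    (negatives), so a purely spatial prior such as center bias scores
    close to chance.
    """
    s = np.asarray(saliency, dtype=float)
    pos = np.array([s[r, c] for r, c in fixations])
    neg = np.array([s[r, c] for r, c in other_fixations])
    return auc_from_scores(pos, neg)
```

Because the negatives share the spatial distribution of real fixations, a map that only encodes where people tend to look regardless of content gains little from it, which is the property a bias-corrected benchmark is after.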