What can saliency models predict about eye movements? Spatial and sequential aspects of fixations during encoding and recognition

Tom Foulsham,Geoffrey Underwood

doi:10.1167/8.2.6

Abstract

Saliency map models account for a small but significant amount of the variance in where people fixate, but evaluating these models with natural stimuli has led to mixed results. In the present study, the eye movements of participants were recorded while they viewed color photographs of natural scenes in preparation for a memory test (encoding) and when recognizing them later. These eye movements were then compared to the predictions of a well defined saliency map model (L. Itti & C. Koch, 2000), in terms of both individual fixation locations and fixation sequences (scanpaths). The saliency model is a significantly better predictor of fixation location than random models that take into account bias toward central fixations, and this is the case at both encoding and recognition. However, similarity between scanpaths made at multiple viewings of the same stimulus suggests that repetitive scanpaths also contribute to where people look. Top-down recapitulation of scanpaths is a key prediction of scanpath theory (D. Noton & L. Stark, 1971), but it might also be explained by bottom-up guidance. The present data suggest that saliency cannot account for scanpaths and that incorporating these sequences could improve model predictions.

Full Text