Abstract

Semantic information is important in eye-movement control. An important semantic influence on gaze guidance relates to object-scene relationships: objects that are semantically inconsistent with the scene attract more fixations than consistent objects. One interpretation of this effect is that fixations are driven toward inconsistent objects because they are semantically more informative. We tested this explanation using contextualized meaning maps, a method based on crowd-sourced ratings to quantify the spatial distribution of context-sensitive “meaning” in images. In Experiment 1, we compared gaze data and contextualized meaning maps for images in which object-scene consistency was manipulated. Observers fixated more on inconsistent than on consistent objects. However, contextualized meaning maps did not assign higher meaning to image regions that contained semantic inconsistencies. In Experiment 2, a large number of raters evaluated image regions that were deliberately selected for their content and expected meaningfulness. The results suggest that the same scene locations were experienced as slightly less meaningful when they contained inconsistent rather than consistent objects. In summary, we demonstrated that, in the context of our rating task, semantically inconsistent objects are experienced as less meaningful than their consistent counterparts, and that contextualized meaning maps do not capture prototypical influences of image meaning on gaze guidance.
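For readers unfamiliar with the meaning-map method, the following Python sketch illustrates its general logic: patch-level "meaningfulness" ratings collected on a coarse grid are averaged, upsampled to image resolution, and smoothed into a continuous map. The grid size, rating scale, and smoothing bandwidth below are illustrative assumptions, not the parameters used in the studies described here.

```python
# Minimal sketch of assembling a meaning map from crowd-sourced patch
# ratings. All parameter values (grid size, 1-7 rating scale, smoothing
# bandwidth) are assumptions for illustration only.
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def meaning_map(patch_ratings, image_shape, sigma=20.0):
    """Turn a coarse grid of mean patch ratings into a pixel-level map.

    patch_ratings: (rows, cols) array of mean ratings, one per patch
                   (e.g., averaged over raters).
    image_shape:   (height, width) of the image in pixels.
    sigma:         Gaussian smoothing bandwidth in pixels (assumed).
    """
    rows, cols = patch_ratings.shape
    h, w = image_shape
    # Upsample the coarse rating grid to pixel resolution.
    dense = zoom(patch_ratings, (h / rows, w / cols), order=1)
    # Smooth so adjacent patches blend into a continuous surface.
    smooth = gaussian_filter(dense, sigma=sigma)
    # Normalize to [0, 1] for comparison with fixation maps.
    smooth -= smooth.min()
    return smooth / max(smooth.max(), 1e-12)

# Example: a 6 x 8 grid of ratings on a 1-7 scale for a 600 x 800 image.
rng = np.random.default_rng(0)
ratings = rng.uniform(1, 7, size=(6, 8))
mmap = meaning_map(ratings, (600, 800))
print(mmap.shape, mmap.min(), mmap.max())
```

A map built this way can then be compared against empirical fixation maps, which is the kind of comparison the experiments above perform with the contextualized variant of the method.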

Highlights

  • In the Inconsistent condition, one of the objects in each scene was replaced with an object that was unusual in the context provided by the whole scene, introducing a semantic inconsistency

  • Contrary to predictions of the meaning map approach, our results provided no evidence that contextualized meaning maps assign more meaning to inconsistent than consistent objects


Introduction

Visual processing varies as a function of the retinal location at which a stimulus is presented: with increasing eccentricity, processing is affected by crowding and a decrease in resolution (see Rosenholtz, 2016, and Stewart et al., 2020, for reviews). Being able to rapidly move the eyes, so that the high-resolution central retina can be directed at different locations, is therefore necessary to extract fine detail across large parts of the visual field. Eye movements are thus critical for visual processing, and it is important to understand which processes underpin gaze guidance. Support for the notion that image-computable aspects of the input are important for the guidance of eye movements comes from studies demonstrating that where humans look in images can often be predicted by analyzing the visual features of these images (Borji et al., 2013). Algorithms generating such predictions are called saliency models. Saliency models, such as GBVS (Harel et al., 2007), AWS (Garcia-Diaz, Fdez-Vidal, et al., 2012; Garcia-Diaz, Leboran, et al., 2012) or the model by Itti and Koch
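To make the notion of an image-computable saliency model concrete, here is a deliberately minimal Python sketch in the spirit of center-surround models such as Itti and Koch's: it scores each pixel by how much its local luminance differs from its surround at several spatial scales. This is a toy illustration under assumed scales and weights, not an implementation of GBVS, AWS, or the Itti-Koch model itself.

```python
# Toy center-surround saliency sketch: high values mark regions whose
# luminance stands out from their surround. Scales are assumed values.
import numpy as np
from scipy.ndimage import gaussian_filter

def toy_saliency(image):
    """Crude saliency map via multi-scale center-surround luminance contrast.

    image: (H, W, 3) RGB array with values in [0, 1].
    """
    # Luminance channel (standard Rec. 601 weights).
    lum = 0.299 * image[..., 0] + 0.587 * image[..., 1] + 0.114 * image[..., 2]
    saliency = np.zeros_like(lum)
    # Difference-of-Gaussians at several (center, surround) scale pairs:
    # pixels that differ strongly from their surround score high.
    for center, surround in [(1, 4), (2, 8), (4, 16)]:
        saliency += np.abs(gaussian_filter(lum, center) -
                           gaussian_filter(lum, surround))
    saliency -= saliency.min()
    return saliency / max(saliency.max(), 1e-12)

# Example: a random array stands in for a natural scene.
rng = np.random.default_rng(0)
img = rng.random((240, 320, 3))
smap = toy_saliency(img)
print(smap.shape, float(smap.max()))
```

Real saliency models add further feature channels (color opponency, orientation) and normalization schemes, but the core idea of predicting fixations from local feature contrast is the same.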

