Abstract

Human gaze shifts from moment to moment across locations in the visual field to acquire, at the fine resolution of foveal vision, the information needed to recognize the external environment. Previous studies showed that gaze is attracted to particular locations in the visual field at particular times, but it remains unclear which visual features produce this spatiotemporal bias. In this study, we used a deep convolutional neural network model to extract hierarchical visual features from natural scene images and evaluated how strongly the human gaze is attracted to those features in space and time. Eye movement measurements combined with this visual feature analysis showed that gaze was more strongly attracted to spatial locations containing higher-order visual features than to locations containing lower-order visual features or to locations predicted by conventional saliency. Analysis of the time course of gaze attraction revealed that the bias toward higher-order visual features was prominent within a short period after observation of the natural scene images began. These results demonstrate that higher-order visual features are a strong gaze attractor in both space and time, suggesting that the human visual system allocates foveal vision resources to extracting information from higher-order visual features with higher spatiotemporal priority.
