Abstract

Human visual attention exploits spatio-temporal regularity, an inherent property of dynamic vision, to perform natural visual tasks. However, recent computational visual attention models still fail to reflect this spatio-temporal regularity thoroughly. Motivated by this, we propose a bio-inspired motion-sensitive model to estimate human gaze positions in driving. The proposed model has four key advantages. First, inspired by two types of motion-sensitive neurons, the model can perceive motion cues from different directions both spatially and temporally. Second, unlike conventional deep-learning-based models, it does not rely on expensive training samples with gaze annotations. Third, the model is built on visual signal processing rather than a complex deep neural network architecture, which allows it to be implemented on low-cost hardware. Fourth, inspired by the visual pathway of Drosophila motion vision, the visual signal processing on directional and depth motion-sensitive maps substantially improves the model's ability to estimate gaze in a manner similar to human gaze behavior in driving. To test the proposed model, we collect a driving scene dataset from an egocentric perspective, designed to systematically evaluate spatio-temporal visual attention models. The video clips in the dataset are categorized into ten common driving conditions. The proposed model is evaluated against a human gaze-position baseline. Experimental results demonstrate that the proposed model effectively estimates human gaze positions in driving and consistently outperforms traditional visual attention models as well as a deep-learning-based model.
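As a rough, non-authoritative illustration of the kind of directional motion sensing the abstract alludes to, the sketch below implements a classic Hassenstein-Reichardt-style elementary motion detector, a standard abstraction of Drosophila-like direction-selective circuits. The function name, delay parameter, and toy stimulus are assumptions for illustration only and do not reproduce the authors' model.

```python
import numpy as np

def reichardt_emd(frames, tau=1):
    """Direction-selective response for a 1-D luminance sequence (illustrative only).

    frames: array of shape (T, N) -- luminance over time at N adjacent positions.
    tau:    delay in frames, standing in for the low-pass filter of one detector arm.
    Returns an array of shape (T - tau, N - 1) whose sign indicates motion direction.
    """
    delayed = frames[:-tau]   # delayed copy of the signal
    current = frames[tau:]    # undelayed copy
    # Correlate each position's delayed signal with its neighbour's current signal,
    # then subtract the mirror-symmetric term to obtain an opponent, signed output.
    rightward = delayed[:, :-1] * current[:, 1:]
    leftward = delayed[:, 1:] * current[:, :-1]
    return rightward - leftward

# Toy usage: a bright bar drifting rightward by one position per frame.
t, n = 20, 32
frames = np.zeros((t, n))
for i in range(t):
    frames[i, i % n] = 1.0
response = reichardt_emd(frames)
print(response.mean())  # positive mean -> net rightward motion signal
```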
