Abstract
Human eye movements while driving reveal that visual attention largely depends on the context in which it occurs. Furthermore, an autonomous vehicle performing this function would be more reliable if its outputs were understandable. Capsule Networks have been presented as a great opportunity to explore new horizons in the Computer Vision field, due to their capability to structure and relate latent information. In this article, we present a hierarchical approach for the prediction of eye fixations in autonomous driving scenarios. Context-driven visual attention can be modeled by considering different conditions which, in turn, are represented as combinations of several spatio-temporal features. With the aim of learning these conditions, we have built an encoder-decoder network that merges visual feature information using a global-local definition of capsules. Two types of capsules are distinguished: representational capsules for features and discriminative capsules for conditions. The latter, together with eye fixations recorded with wearable eye-tracking glasses, allow the model to learn both to predict contextual conditions and to estimate visual attention by means of a multi-task loss function. Experiments show how our approach is able to express either frame-level (global) or pixel-wise (local) relationships between features and contextual conditions, allowing for interpretability while maintaining or improving the performance of related black-box systems in the literature. Indeed, our proposal offers an improvement of 29% in terms of Information Gain with respect to the best performance reported in the literature.
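As a rough illustration of the multi-task setup described in the abstract, the sketch below combines a fixation-map (saliency) term with a frame-level contextual-condition classification term. The module name, the balancing weight and the specific loss terms are illustrative assumptions, not the paper's exact capsule-based formulation.

```python
# Minimal PyTorch sketch of a multi-task objective: eye-fixation prediction
# plus contextual-condition classification. Names and weights are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskAttentionLoss(nn.Module):
    def __init__(self, condition_weight: float = 0.5):
        super().__init__()
        self.condition_weight = condition_weight  # assumed balancing factor

    def forward(self, pred_saliency, gt_fixation_map, pred_conditions, gt_conditions):
        eps = 1e-8
        # Saliency term: KL divergence between the ground-truth fixation
        # distribution and the predicted map, both normalized per frame.
        p = gt_fixation_map / (gt_fixation_map.sum(dim=(-2, -1), keepdim=True) + eps)
        q = pred_saliency / (pred_saliency.sum(dim=(-2, -1), keepdim=True) + eps)
        saliency_loss = (p * torch.log((p + eps) / (q + eps))).sum(dim=(-2, -1)).mean()

        # Condition term: frame-level multi-label classification of contextual
        # conditions (e.g. intersection ahead, pedestrian present) -- assumed targets.
        condition_loss = F.binary_cross_entropy_with_logits(pred_conditions, gt_conditions)

        return saliency_loss + self.condition_weight * condition_loss
```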
Highlights
The way contemporary Computer Vision systems represent our world seems progressively further from being understood by humans.
In an effort to contribute to visual attention understanding in real settings, we propose a top-down (TD) system to carry out an autonomous driving task, which is able to offer an interpretation of its predictions by means of Capsule Networks [17], [18].
Their performance is significantly lower on Kullback-Leibler Divergence (KL) and Information Gain (IG), which suggests that CC, shuffled Area Under the Curve (sAUC) and the shuffled variant of Normalized Scanpath Saliency (sNSS) are more saturated metrics, while KL and IG are more expressive for our analysis.
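For reference, the sketch below implements the common benchmark definitions of the KL and IG metrics mentioned above (prediction and baseline maps treated as probability distributions, fixations as a binary map). The function names, the `baseline_map` argument and the evaluation details are assumptions; the paper's exact protocol may differ.

```python
# Sketch of KL divergence and Information Gain as used in saliency evaluation.
import numpy as np

def kl_divergence(saliency_map, fixation_map, eps=1e-8):
    """KL between the ground-truth fixation distribution and the predicted map."""
    p = fixation_map / (fixation_map.sum() + eps)   # ground truth as a distribution
    q = saliency_map / (saliency_map.sum() + eps)   # prediction as a distribution
    return float(np.sum(p * np.log(eps + p / (q + eps))))

def information_gain(saliency_map, fixation_points, baseline_map, eps=1e-8):
    """Average log-likelihood gain (in bits) over a baseline at fixated pixels.

    fixation_points is a binary map of discrete fixation locations;
    baseline_map is an assumed reference (e.g. a center-bias prior).
    """
    q = saliency_map / (saliency_map.sum() + eps)
    b = baseline_map / (baseline_map.sum() + eps)
    fix = fixation_points.astype(bool)
    return float(np.mean(np.log2(q[fix] + eps) - np.log2(b[fix] + eps)))
```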
Summary
The way contemporary Computer Vision systems represent our world seems progressively further from being understood by humans. Both the performance and the complexity of feature learning methods increase together; this complexity derives from the application of Deep Learning (DL) and Convolutional Neural Networks (CNNs) to compelling but challenging vision tasks such as object recognition [1] and tracking [2], or anomaly detection in video surveillance scenarios [3]. Noteworthy is the role of eye movements in visual