Abstract

Efficient understanding of the environment is a crucial prerequisite for autonomous driving, but explicitly modeling the environment is difficult to achieve in practice. In contrast, imitation learning can, in theory, learn a direct mapping from visual input to driving commands, yet the opacity of its learned scene representation remains a challenging problem. In this paper, we propose to enhance the abstract representation of the visual scene from two complementary aspects for better scene understanding, i.e., a Visual Guide path and a Driving Affordances path. In the Visual Guide path, we leverage semantic information as a visual prior to learn the intuitive state of the environment, e.g., the spatial semantic occupancy of the visual scene. In the Driving Affordances path, several driving affordance indicators reflecting the relationship between the environment and vehicle behavior are learned as global guidance, steering the driving system toward safe and efficient driving policies. Exploiting the complementarity of these two paths, a Bilateral Guide Network is designed to realize the complete mapping from visual input to driving commands. Our method is evaluated on the CARLA simulator across various scenarios to demonstrate its effectiveness. In addition, comparative analyses against state-of-the-art methods confirm the performance of our approach for autonomous driving.
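To make the two-path idea concrete, the sketch below shows one plausible way a bilateral architecture of this kind could be wired up in PyTorch: a visual branch that predicts a coarse semantic occupancy prior, an affordance branch that regresses a few indicators, and a policy head that fuses both into driving commands. All module names, layer sizes, affordance choices, and the fusion scheme are assumptions for illustration only, not the authors' implementation.

```python
# Illustrative sketch of a two-branch ("bilateral") driving network.
# Every architectural detail here is a hypothetical stand-in for the paper's method.
import torch
import torch.nn as nn


class VisualGuideBranch(nn.Module):
    """Encodes the RGB frame and predicts a coarse semantic occupancy map
    used as a visual prior (number of classes assumed)."""
    def __init__(self, num_classes=6):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Conv2d(128, num_classes, 1)  # semantic occupancy logits

    def forward(self, img):
        feat = self.encoder(img)
        return feat, self.seg_head(feat)


class AffordanceBranch(nn.Module):
    """Regresses a few driving-affordance indicators (e.g. distance to lane
    centre, heading error, gap to a leading vehicle) from shared features."""
    def __init__(self, feat_dim=128, num_affordances=3):
        super().__init__()
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(feat_dim, 64), nn.ReLU(),
            nn.Linear(64, num_affordances),
        )

    def forward(self, feat):
        return self.head(feat)


class BilateralGuideNet(nn.Module):
    """Fuses both branches and maps the result to driving commands
    (steer, throttle, brake)."""
    def __init__(self):
        super().__init__()
        self.visual = VisualGuideBranch()
        self.affordance = AffordanceBranch()
        self.policy = nn.Sequential(
            nn.Linear(128 + 3, 64), nn.ReLU(),
            nn.Linear(64, 3),  # steer, throttle, brake
        )

    def forward(self, img):
        feat, seg_logits = self.visual(img)
        aff = self.affordance(feat)
        pooled = feat.mean(dim=(2, 3))  # global visual descriptor
        control = self.policy(torch.cat([pooled, aff], dim=1))
        return control, seg_logits, aff


if __name__ == "__main__":
    net = BilateralGuideNet()
    control, seg, aff = net(torch.randn(1, 3, 128, 128))
    print(control.shape, seg.shape, aff.shape)  # (1, 3), (1, 6, 16, 16), (1, 3)
```

In a setup like this, the segmentation and affordance heads would be trained with auxiliary losses alongside the imitation loss on the control output, so that the intermediate representation stays interpretable while still supporting end-to-end driving.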
