Perspective Transformer and MobileNets-Based 3D Lane Detection from Single 2D Image

Mengyu Li,Kyungeun Cho,Phuong Minh Chu

doi:10.3390/math10193697

Abstract

Three-dimensional (3D) lane detection is widely used in image understanding, image analysis, 3D scene reconstruction, and autonomous driving. Recently, various methods for 3D lane detection from single two-dimensional (2D) images have been proposed to address inaccurate lane layouts in scenarios (e.g., uphill, downhill, and bumps). Many previous studies struggled in solving complex cases involving realistic datasets. In addition, these methods have low accuracy and high computational resource requirements. To solve these problems, we put forward a high-quality method to predict 3D lanes from a single 2D image captured by conventional cameras, which is also cost effective. The proposed method comprises the following three stages. First, a MobileNet model that requires low computational resources was employed to generate multiscale front-view features from a single RGB image. Then, a perspective transformer calculated bird’s eye view (BEV) features from the front-view features. Finally, two convolutional neural networks were used for predicting the 2D and 3D coordinates and respective lane types. The results of the high-reliability experiments verified that our method achieves fast convergence and provides high-quality 3D lanes from single 2D images. Moreover, the proposed method requires no exceptional computational resources, thereby reducing its implementation costs.

Full Text