Abstract

Deep-learning-based algorithms have substantially improved the accuracy of traffic lane detection. Most of them employ stacked convolutional neural networks (CNNs) to extract semantics, but stacked CNNs struggle to gather global visual information. Moreover, they perform repeated downsampling, which reduces the resolution of the feature maps and makes small features easy to miss. To address these problems, we propose LaneTD, a novel lane feature aggregator that extracts and fuses local and global features. The method adopts dilated convolutions to extract local features at various scales and enhances the global representation with a transformer encoder. The local and global information is then fused with learned weights through a bidirectional feature pyramid. In addition, we apply unsupervised image style transfer to improve performance in nighttime and dark scenarios. The method is evaluated quantitatively on the challenging public CULane dataset, and the results show that it significantly improves the accuracy and computational efficiency of lane detection across different scenarios. Furthermore, an ablation study is carried out, and the resulting discussion of efficiency trade-offs can guide choices in practice.
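
The sketch below illustrates, in PyTorch, the kind of local-global aggregation the abstract describes: parallel dilated convolutions for multi-scale local features, a transformer encoder for global context, and a learnable weighted fusion in the spirit of a bidirectional feature pyramid. It is not the authors' implementation; all layer sizes, names, and the single-level fusion are illustrative assumptions.

```python
# Minimal sketch (not the LaneTD code) of a local-global feature aggregator.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocalGlobalAggregator(nn.Module):
    def __init__(self, channels: int = 64, dilations=(1, 2, 4)):
        super().__init__()
        # Parallel dilated convolutions: same kernel, growing receptive field.
        self.local_branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        )
        self.local_fuse = nn.Conv2d(len(dilations) * channels, channels, 1)
        # Transformer encoder over flattened spatial tokens for global context.
        layer = nn.TransformerEncoderLayer(
            d_model=channels, nhead=4, dim_feedforward=2 * channels,
            batch_first=True,
        )
        self.global_encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Learnable non-negative fusion weights (BiFPN-style fast fusion).
        self.fusion_w = nn.Parameter(torch.ones(2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Local stream: concatenate multi-scale dilated responses, then fuse.
        local = self.local_fuse(
            torch.cat([branch(x) for branch in self.local_branches], dim=1)
        )
        # Global stream: (B, C, H, W) -> (B, H*W, C) tokens -> encoder -> back.
        tokens = x.flatten(2).transpose(1, 2)
        global_feat = (
            self.global_encoder(tokens).transpose(1, 2).reshape(b, c, h, w)
        )
        # Weighted fusion with softplus-free normalized positive weights.
        w_norm = F.relu(self.fusion_w)
        w_norm = w_norm / (w_norm.sum() + 1e-4)
        return w_norm[0] * local + w_norm[1] * global_feat


# Example: aggregate a downsampled backbone feature map.
feat = torch.randn(1, 64, 18, 50)
out = LocalGlobalAggregator(64)(feat)
print(out.shape)  # torch.Size([1, 64, 18, 50])
```

A real bidirectional feature pyramid would apply this weighted fusion across several resolution levels in both top-down and bottom-up passes; the single-level version above only shows the weighting mechanism.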
