Abstract

Contemporary segmentation methods are usually based on deep fully convolutional networks (FCNs). However, layer-by-layer convolutions with a growing receptive field are not effective at capturing long-range context such as lane markers in a scene. In this paper, we address this issue by designing a distillation method that exploits label structure when training the segmentation network. The intuition is that the ground-truth lane annotations themselves exhibit internal structure. We broadcast these structure hints through a teacher network: we train a teacher network that consumes a lane label map as input and attempts to replicate it as output. The attention maps of the teacher network are then adopted as supervision for the student segmentation network. The teacher network, with label structure information embedded, knows distinctly where the convolutional layers should pay visual attention. We name the proposed method Label-guided Attention Distillation (LGAD). The student network learns significantly better with LGAD than when learning alone. Because the teacher network is discarded after training, our method does not increase inference time. Note that LGAD can be easily incorporated into any lane segmentation network. To validate the effectiveness of LGAD, extensive experiments have been conducted on two popular lane detection benchmarks: TuSimple and CULane. The results show consistent improvement across a variety of convolutional neural network architectures. In particular, we demonstrate the accuracy boost of LGAD on the lightweight model ENet; ENet-LGAD surpasses existing lane segmentation algorithms. The main contributions of this paper are a newly proposed distillation training strategy (LGAD) and a thorough experimental investigation of its inner mechanism.
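The abstract describes supervising a student segmentation network with the attention maps of a label-trained teacher. As a minimal sketch of this idea (not the paper's exact formulation; the channel-squaring attention map and L2 matching loss follow the common attention-transfer recipe, and all tensor shapes are hypothetical), the distillation term might look like:

```python
import torch
import torch.nn.functional as F

def attention_map(feat):
    # Collapse the channel dimension of a feature tensor (N, C, H, W)
    # into a spatial attention map by summing squared activations,
    # then normalise each map to unit L2 norm.
    am = feat.pow(2).sum(dim=1)      # (N, H, W)
    am = am.flatten(1)               # (N, H*W)
    return F.normalize(am, p=2, dim=1)

def attention_distillation_loss(student_feat, teacher_feat):
    # L2 distance between normalised attention maps. The teacher's
    # map is detached so it acts as a fixed target and gradients
    # flow only into the student network.
    t = attention_map(teacher_feat).detach()
    s = attention_map(student_feat)
    return (s - t).pow(2).mean()
```

This loss would be added to the usual segmentation loss during training; at inference time the teacher (and this term) is dropped entirely, consistent with the abstract's claim of zero inference overhead.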
