Abstract

The visual perception model is critical to autonomous driving systems. It provides the information necessary for self-driving cars to make decisions in traffic scenes. We propose a lightweight multi-task network (Mobip) to simultaneously perform traffic object detection, drivable area segmentation, and lane line detection. The network consists of a shared encoder for feature extraction and two decoders that jointly handle the detection and segmentation tasks. By using MobileNetV2 as the backbone and an extremely efficient multi-task architecture to implement the perception model, our network achieves a substantial advantage in inference speed. The performance of the multi-task network is verified on the challenging public Berkeley Deep Drive (BDD100K) dataset. The model achieves an inference speed of 58 FPS on an NVIDIA Tesla V100 while maintaining competitive performance on all three tasks compared to other multi-task networks. In addition, the effectiveness and efficiency of the multi-task architecture are verified via ablation studies.
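The shared-encoder, multi-decoder layout described above can be sketched structurally as follows. This is a minimal illustrative sketch only: the class and method names are placeholders, not the authors' implementation, and the components are stand-ins for the real MobileNetV2 backbone and task heads.

```python
# Structural sketch of a Mobip-style multi-task model: one shared encoder
# feeds two decoders, one for traffic-object detection and one covering
# both segmentation tasks (drivable area + lane lines).
# All names here are hypothetical placeholders for illustration.

class SharedEncoder:
    """Stand-in for a MobileNetV2-style backbone that extracts shared features."""
    def forward(self, image):
        # In the real model this would return MobileNetV2 feature maps.
        return {"features": image}

class DetectionDecoder:
    """Stand-in for the traffic-object detection head."""
    def forward(self, feats):
        return {"boxes": [], "scores": []}

class SegmentationDecoder:
    """Stand-in head producing drivable-area and lane-line masks jointly."""
    def forward(self, feats):
        return {"drivable_mask": None, "lane_mask": None}

class Mobip:
    """Shared encoder + two decoders; backbone features are computed once."""
    def __init__(self):
        self.encoder = SharedEncoder()
        self.det_head = DetectionDecoder()
        self.seg_head = SegmentationDecoder()

    def forward(self, image):
        feats = self.encoder.forward(image)  # computed once, reused by both heads
        return {
            "detection": self.det_head.forward(feats),
            "segmentation": self.seg_head.forward(feats),
        }

out = Mobip().forward(image="frame")
print(sorted(out))  # ['detection', 'segmentation']
```

The design point the sketch highlights is that the expensive backbone pass runs once per image and its features are reused by every task head, which is what makes the multi-task layout cheaper than running three separate networks.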
