Abstract

Pixel-wise semantic segmentation can unify most driving-scene perception tasks and has enabled striking progress in navigation assistance, where sensing the entire surroundings is vital. However, mainstream semantic segmenters are predominantly benchmarked on datasets with a narrow Field of View (FoV), and a large share of vision-based intelligent vehicles rely on a single forward-facing camera. In this paper, we propose a Panoramic Annular Semantic Segmentation (PASS) framework that perceives the whole surroundings based on a compact panoramic annular lens system and an online panorama unfolding process. To facilitate the training of PASS models, we leverage conventional narrow-FoV imaging datasets, bypassing the effort entailed in creating fully dense panoramic annotations. To consistently exploit the rich contextual cues in the unfolded panorama, we adapt our real-time ERF-PSPNet to predict semantically meaningful feature maps on separate segments and fuse them to accomplish panoramic scene parsing. The innovation lies in the network adaptation that enables smooth and seamless segmentation, combined with an extended set of heterogeneous data augmentations to attain robustness on panoramic imagery. A comprehensive variety of experiments demonstrates the effectiveness of real-world surrounding perception in a single PASS, and the adaptation proves highly beneficial for state-of-the-art efficient networks.
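
As context for the panorama unfolding step mentioned above, the sketch below illustrates one common way to unwrap an annular image into a rectangular panorama via a polar-to-Cartesian remapping. This is a minimal illustration assuming OpenCV and NumPy; the function name `unfold_annular` and the calibration parameters (`center`, `r_inner`, `r_outer`, `out_w`) are hypothetical placeholders, not the paper's actual implementation.

```python
import cv2
import numpy as np

def unfold_annular(img, center, r_inner, r_outer, out_w=2048):
    """Unwrap an annular panoramic image into a rectangular panorama.

    The annulus between r_inner and r_outer around `center` is sampled
    along radial rays: each output column corresponds to one azimuth
    angle and each output row to one radius.
    """
    out_h = int(r_outer - r_inner)
    # Azimuth angle for each output column, radius for each output row.
    theta = np.linspace(0.0, 2.0 * np.pi, out_w, endpoint=False)
    radius = np.linspace(r_inner, r_outer, out_h)
    # Build the sampling grid: output pixel (row, col) pulls from the
    # annular image at (cx + r*cos(t), cy + r*sin(t)).
    rr, tt = np.meshgrid(radius, theta, indexing="ij")
    map_x = (center[0] + rr * np.cos(tt)).astype(np.float32)
    map_y = (center[1] + rr * np.sin(tt)).astype(np.float32)
    return cv2.remap(img, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```

In practice, the center and the inner/outer radii would come from calibrating the panoramic annular lens, and depending on the lens orientation the rows may need to be flipped so that the top of the panorama corresponds to the top of the scene.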
