Abstract

With a larger field of view (FOV) than ordinary images, fisheye images are becoming mainstream in autonomous driving. However, their severe distortion limits their application. Neural networks designed for narrow-FOV images degrade drastically on fisheye images; large composite models can improve performance, but at the cost of substantial time overhead and hardware cost. To balance real-time performance and accuracy, we design the deformable segmentation attention (DSA) module, a general-purpose architecture based on a deformable attention mechanism and a spatial pyramid architecture. The deformable mechanism accurately extracts feature information from fisheye images, attention learns the global context, and the spatial pyramid structure balances multi-scale feature information, improving the perception of fisheye images by traditional networks without excessive additional computation. Lightweight networks such as SegNeXt equipped with the DSA module achieve effective and rapid multi-scale segmentation of fisheye images in complex scenes. Our architecture achieves outstanding results on the WoodScape dataset, and our ablation experiments demonstrate the effectiveness of each part of the architecture.

