X-Align: Cross-Modal Cross-View Alignment for Bird’s-Eye-View Segmentation

Shubhankar Borse,Hong Cai,Marvin Klingner,Senthil Yogamani,Varun Ravi Kumar,Abdulaziz Almuzairee,Fatih Porikli

doi:10.1109/wacv56688.2023.00330


Open Access

https://doi.org/10.1109/wacv56688.2023.00330


Abstract

Bird’s-eye-view (BEV) grid is a typical representation of the perception of road components, e.g., drivable area, in autonomous driving. Most existing approaches rely on cameras only to perform segmentation in BEV space, which is fundamentally constrained by the absence of reliable depth information. The latest works leverage both camera and LiDAR modalities but suboptimally fuse their features using simple, concatenation-based mechanisms. In this paper, we address these problems by enhancing the alignment of the unimodal features in order to aid feature fusion, as well as enhancing the alignment between the cameras’ perspective view (PV) and BEV representations. We propose X-Align, a novel end-to-end cross-modal and cross-view learning framework for BEV segmentation consisting of the following components: (i) a novel Cross-Modal Feature Alignment (X-FA) loss, (ii) an attention-based Cross-Modal Feature Fusion (X-FF) module to align multi-modal BEV features implicitly, and (iii) an auxiliary PV segmentation branch with Cross-View Segmentation Alignment (X-SA) losses to improve the PV-to-BEV transformation. We evaluate our proposed method across two commonly used benchmark datasets, i.e., nuScenes and KITTI-360. Notably, X-Align significantly outperforms the state-of-the-art by 3 absolute mIoU points on nuScenes. We also provide extensive ablation studies to demonstrate the effectiveness of the individual components.
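
The abstract reports its headline result in mIoU (mean intersection-over-union). As a rough illustration of that metric only, the sketch below computes IoU per class on small binary BEV grids and averages the per-class scores. It assumes one binary mask per class; the class names, grid sizes, and the helper names `iou` and `mean_iou` are hypothetical and not taken from the paper, and the exact evaluation protocol used by the authors is not stated in the abstract. On this 0 to 1 scale, a gain of 3 absolute mIoU points corresponds to a difference of 0.03.

```python
from typing import Dict, List

# A binary BEV mask: 1 means the grid cell belongs to the class.
Grid = List[List[int]]


def iou(pred: Grid, target: Grid) -> float:
    """Intersection-over-union of two binary grids of equal shape."""
    inter = 0
    union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # If the class is absent in both grids, IoU is undefined.
    return inter / union if union else float("nan")


def mean_iou(preds: Dict[str, Grid], targets: Dict[str, Grid]) -> float:
    """Average the per-class IoU, skipping classes with undefined IoU."""
    scores = [iou(preds[name], targets[name]) for name in targets]
    valid = [s for s in scores if s == s]  # NaN is not equal to itself
    return sum(valid) / len(valid) if valid else float("nan")


# Toy 2x3 grids with illustrative (hypothetical) class names.
preds = {
    "drivable_area": [[1, 1, 0], [1, 0, 0]],
    "walkway": [[0, 0, 1], [0, 0, 1]],
}
targets = {
    "drivable_area": [[1, 1, 0], [0, 0, 0]],
    "walkway": [[0, 0, 1], [0, 1, 1]],
}

score = mean_iou(preds, targets)
print(f"mIoU = {score:.3f}")
```

Both toy classes have an intersection of 2 cells and a union of 3 cells, so each per-class IoU is 2/3 and so is their mean.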


Similar Papers

X-Align++: cross-modal cross-view alignment for Bird’s-eye-view segmentation
Shubhankar Borse ... Marvin Klingner
Machine Vision and Applications | VOL. 34
16 May 2023

BiFNet: Bidirectional Fusion Network for Road Segmentation.
Haoran Li ... Yaran Chen
IEEE Transactions on Cybernetics | VOL. 52
01 Sep 2022

Synergy of Sentinel-1 and Sentinel-2 Imagery for Crop Classification Based on DC-CNN
Kaixin Zhang ... Ning Li
Remote Sensing | VOL. 15
24 May 2023

MPFFNet: LULC classification model for high-resolution remote sensing images with multi-path feature fusion
Hao Yuan ... Shuwen Yang
International Journal of Remote Sensing | VOL. 44
02 Oct 2023
