Abstract
Semantic segmentation of LiDAR point clouds is essential for autonomous driving, yet current methods often suffer from low segmentation accuracy and redundant features. To address these issues, this paper proposes a novel approach based on the adaptive fusion of multi-scale sparse convolution and point convolution. First, to overcome the redundant feature extraction of existing sparse 3D convolutions, we introduce an asymmetric importance of spatial locations (IoSL) sparse 3D convolution module. By weighting input feature positions according to their importance, this module improves sparse learning of critical feature information and strengthens the extraction of intrinsic features in both the vertical and horizontal directions. Second, to mitigate the large discrepancies between single-type, single-scale features, we propose a multi-scale feature fusion cross-gating module. It employs a gating mechanism to improve fusion accuracy across receptive fields of different scales, and a cross self-attention mechanism to adapt to the distinct propagation characteristics of point and voxel features, further enhancing fusion performance. Comparative experiments and ablation studies on the SemanticKITTI and nuScenes datasets validate the generality and effectiveness of the proposed approach, which significantly improves accuracy and robustness over state-of-the-art methods.
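To make the cross-gating idea concrete, the following is a minimal NumPy sketch of one plausible reading of the fusion step: point features attend over voxel features via cross-attention, and a sigmoid gate mixes the two branches per channel. All function and variable names, the single-head formulation, and the exact gating formula are illustrative assumptions for exposition, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_gated_fuse(point_feat, voxel_feat, Wq, Wk, Wv, Wg):
    """Hypothetical single-head cross-attention + gated fusion.

    point_feat, voxel_feat: (N, d) features assumed to be aligned
    per point (each point paired with its containing voxel's feature).
    """
    # Cross-attention: point features query the voxel features.
    q = point_feat @ Wq
    k = voxel_feat @ Wk
    v = voxel_feat @ Wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    attended = attn @ v
    # Sigmoid gate chooses a per-channel mix of the two branches.
    z = np.concatenate([point_feat, attended], axis=-1) @ Wg
    g = 1.0 / (1.0 + np.exp(-z))
    return g * point_feat + (1.0 - g) * attended

N, d = 16, 8
P = rng.normal(size=(N, d))   # point-branch features (toy data)
V = rng.normal(size=(N, d))   # voxel-branch features (toy data)
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Wg = rng.normal(size=(2 * d, d))
fused = cross_gated_fuse(P, V, Wq, Wk, Wv, Wg)
print(fused.shape)  # fused features keep the per-point shape (16, 8)
```

Because the gate is computed per channel from both branches, the module can lean on the point branch where fine geometry matters and on the voxel branch where larger receptive fields help, which matches the adaptive-fusion motivation stated above.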