Abstract

In this study, we introduce a novel framework for the semantic segmentation of point clouds in autonomous driving scenarios, termed PVI-Net. This framework uniquely integrates three different data perspectives—point clouds, voxels, and distance maps—executing feature extraction through three parallel branches. Throughout this process, we ingeniously design a point cloud–voxel cross-attention mechanism and a multi-perspective feature fusion strategy for point images. These strategies facilitate information interaction across different feature dimensions of perspectives, thereby optimizing the fusion of information from various viewpoints and significantly enhancing the overall performance of the model. The network employs a U-Net structure and residual connections, effectively merging and encoding information to improve the precision and efficiency of semantic segmentation. We validated the performance of PVI-Net on the SemanticKITTI and nuScenes datasets. The results demonstrate that PVI-Net surpasses most of the previous methods in various performance metrics.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call