Abstract
Efficient semantic segmentation of large-scale three-dimensional (3D) point clouds is essential for intelligent robots to perceive their surrounding environment. However, due to expensive sampling processes or time-consuming pre/post-processing steps, most current solutions are either inefficient or limited in scale. In this paper, we propose a novel framework, ThickSeg, to efficiently assign semantic labels to large-scale point clouds. ThickSeg consists of three main steps. First, it projects raw point clouds onto a multi-layer image with a random-hit strategy to efficiently preserve more local geometric features. Second, the projected multi-layer image is fed into a Self-Sorting 3D Convolutional Neural Network (SS-3DCNN) to predict grid-wise semantics, which are then projected back onto their corresponding 3D points. Finally, the labels of occluded points are determined by an iterative and accumulative post-processing mechanism, avoiding time-consuming explicit 3D neighborhood searching. We validate our approach on two well-known public benchmarks (SemanticKITTI and KITTI), where ThickSeg achieves state-of-the-art results with higher efficiency than previous methods. Our detailed ablation study shows how each component contributes to the final performance.
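The random-hit multi-layer projection can be illustrated with a minimal sketch. This is not the paper's implementation: the image resolution, number of layers, LiDAR field of view, and the shuffle-then-fill selection rule below are all illustrative assumptions; the abstract specifies only that multiple points per grid cell are retained via a random-hit strategy.

```python
import numpy as np

def project_multilayer(points, H=64, W=1024, K=3, rng=None):
    """Project an (N, 3) point cloud onto a K-layer spherical range image.

    When several points hit the same (row, col) cell, a random-hit
    strategy keeps up to K of them in random order, rather than only
    the nearest point, so more local geometry survives the projection.
    H, W, K, and the field-of-view bounds are assumed values.
    """
    rng = np.random.default_rng() if rng is None else rng
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)  # azimuth angle in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(depth, 1e-9), -1.0, 1.0))
    fov_up, fov_down = np.deg2rad(3.0), np.deg2rad(-25.0)  # assumed sensor FOV
    col = (((1.0 - yaw / np.pi) * 0.5) * W).astype(int) % W
    row = (fov_up - pitch) / (fov_up - fov_down) * H
    row = np.clip(row, 0, H - 1).astype(int)

    image = np.zeros((K, H, W), dtype=np.float32)  # per-layer depth channels
    index = -np.ones((K, H, W), dtype=int)         # map back to point indices
    count = np.zeros((H, W), dtype=int)            # hits stored per cell
    for i in rng.permutation(len(points)):         # random hit order
        r, c = row[i], col[i]
        k = count[r, c]
        if k < K:                                  # keep the first K random hits
            image[k, r, c] = depth[i]
            index[k, r, c] = i
            count[r, c] += 1
    return image, index

pts = np.random.default_rng(0).normal(size=(5000, 3))
img, idx = project_multilayer(pts)
print(img.shape)  # (3, 64, 1024)
```

The `index` map records which raw point filled each cell of each layer, which is what allows grid-wise predictions to be projected back onto the original 3D points in the second step.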