Abstract

Efficient semantic segmentation of large-scale three-dimensional (3D) point clouds is essential for intelligent robots to perceive their surroundings. However, due to expensive sampling processes or time-consuming pre- and post-processing steps, most current solutions are either inefficient or limited in scale. In this paper, we propose a novel framework, ThickSeg, to efficiently assign semantic labels to large-scale point clouds. ThickSeg consists of three main steps. First, it projects the raw point cloud onto a multi-layer image with a random-hit strategy, efficiently preserving more local geometric features. Second, the projected multi-layer image is fed into a Self-Sorting 3D Convolutional Neural Network (SS-3DCNN) that predicts grid-wise semantics, which are then projected back to the corresponding 3D points. Finally, the labels of occluded points are determined by an iterative, accumulative post-processing mechanism that avoids time-consuming explicit 3D neighborhood search. We validate our approach on two well-known public benchmarks, SemanticKITTI and KITTI, where ThickSeg achieves state-of-the-art results while being more efficient than previous methods. A detailed ablation study shows how each component contributes to the final performance.
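
To make the first step concrete, the sketch below shows one plausible reading of the multi-layer projection with a random-hit strategy: points are mapped to a spherical (range-image-style) grid, and when several points fall into the same cell, a random subset fills the layers instead of keeping only the nearest hit. This is a minimal illustration, not the paper's implementation; the function name, grid size, field-of-view bounds, layer count, and the shuffle-based tie-breaking are all assumptions for exposition.

```python
import numpy as np

def project_multilayer(points, H=64, W=2048, n_layers=3,
                       fov_up=3.0, fov_down=-25.0, rng=None):
    """Project an (N, 3) point cloud onto an (n_layers, H, W, 4) grid.

    Hypothetical sketch of a multi-layer projection with a random-hit
    strategy: shuffling the insertion order makes the choice of which
    points occupy a cell's layers random rather than depth-ordered.
    """
    rng = np.random.default_rng() if rng is None else rng
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points, axis=1)

    # Spherical angles -> pixel coordinates (standard range-image layout).
    yaw = np.arctan2(y, x)
    pitch = np.arcsin(z / np.maximum(depth, 1e-8))
    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    u = ((1.0 - (pitch - fov_down_r) / (fov_up_r - fov_down_r)) * H).astype(int)
    v = ((0.5 * (1.0 + yaw / np.pi)) * W).astype(int)
    u = np.clip(u, 0, H - 1)
    v = np.clip(v, 0, W - 1)

    # Random-hit: visit points in random order; each cell keeps the
    # first n_layers hits it sees, i.e. a random subset of its points.
    order = rng.permutation(len(points))
    grid = np.zeros((n_layers, H, W, 4), dtype=np.float32)   # (x, y, z, depth)
    count = np.zeros((H, W), dtype=int)
    hit_idx = -np.ones((n_layers, H, W), dtype=int)          # back-projection map
    for i in order:
        c = count[u[i], v[i]]
        if c < n_layers:
            grid[c, u[i], v[i]] = (x[i], y[i], z[i], depth[i])
            hit_idx[c, u[i], v[i]] = i
            count[u[i], v[i]] = c + 1
    return grid, hit_idx
```

Under this reading, `hit_idx` records which 3D point produced each occupied cell, so grid-wise predictions can be scattered back to points without any 3D neighborhood search; points that never hit a cell (index -1 everywhere) are the occluded ones left to the post-processing step.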
