Abstract

Encoder-decoder architectures and novel convolution operations for point clouds have significantly improved point cloud semantic segmentation. However, current works usually ignore co-occurrence relationships within a local region, even though these can directly influence segmentation results. In this paper, we propose a neighborhood co-occurrence matrix (NCM) to model local co-occurrence relationships in a point cloud. We generate a target NCM from the semantic labels and a prediction NCM from the prediction map. Kullback-Leibler (KL) divergence is then used to maximize the similarity between the target and prediction NCMs, so that the network learns the co-occurrence relationships. Moreover, for large scenes where the NCMs of a sampled point cloud and of the whole scene differ greatly, we introduce a reverse form of KL divergence, which better handles this difference, to supervise the prediction NCMs. We integrate our method into an existing backbone and conduct comprehensive experiments on three datasets: Semantic3D for outdoor space segmentation, and S3DIS and ScanNet v2 for indoor scene segmentation. Results indicate that our method significantly improves upon the backbone and outperforms many leading competitors.
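As a rough illustration of the idea, the sketch below builds a co-occurrence matrix over k-nearest-neighbor pairs and compares a target and a prediction matrix with KL divergence. The abstract does not give the exact construction, so everything here is an assumption made for illustration: the function names, the brute-force neighbor search, the outer-product accumulation, the normalization to a distribution, and which argument order counts as "forward" versus "reverse".

```python
import math

def knn_indices(points, k):
    """Brute-force k nearest neighbors of each point (the point itself excluded)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append([j for _, j in dists[:k]])
    return result

def one_hot(labels, num_classes):
    """Turn integer labels into one-hot class distributions."""
    return [[1.0 if c == y else 0.0 for c in range(num_classes)] for y in labels]

def neighborhood_cooccurrence(probs, neighbors):
    """Assumed NCM: sum the outer product of class distributions over all
    (point, neighbor) pairs, then normalize so the entries sum to 1."""
    num_classes = len(probs[0])
    ncm = [[0.0] * num_classes for _ in range(num_classes)]
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            for a in range(num_classes):
                for b in range(num_classes):
                    ncm[a][b] += probs[i][a] * probs[j][b]
    total = sum(sum(row) for row in ncm)
    return [[v / total for v in row] for row in ncm]

def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) between two matrices treated as flat distributions.
    eps is a hypothetical smoothing constant to avoid log(0)."""
    return sum(pv * math.log((pv + eps) / (qv + eps))
               for p_row, q_row in zip(p, q)
               for pv, qv in zip(p_row, q_row))
```

With a target matrix `t` built from one-hot labels and a prediction matrix `s` built from softmax outputs, `kl_divergence(t, s)` and `kl_divergence(s, t)` give the two argument orders; swapping the arguments is what is meant here by a reverse form.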

Highlights

  • With advances in scanning devices, much 3D data has been produced and widely used in augmented and virtual reality, 3D games, and robotics

  • Inspired by the design of co-occurrence matrices for words, we propose a neighborhood co-occurrence matrix to model the relationship of neighboring co-occurring categories in local regions of point clouds

  • We report the results of our method and many state-of-the-art competitors on S3DIS Area-5 in Table 2; mean IoU is taken as the metric to evaluate segmentation performance
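
For readers unfamiliar with the metric, mean IoU averages the per-class intersection over union, TP / (TP + FP + FN). The sketch below is a minimal version under one assumed convention: classes that appear in neither the labels nor the predictions are skipped. The function name and that convention are illustrative and are not taken from the paper's evaluation code.

```python
def mean_iou(targets, predictions, num_classes):
    """Mean intersection-over-union over classes present in targets or predictions."""
    ious = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(targets, predictions) if t == c and p == c)
        fp = sum(1 for t, p in zip(targets, predictions) if t != c and p == c)
        fn = sum(1 for t, p in zip(targets, predictions) if t == c and p != c)
        union = tp + fp + fn
        if union == 0:
            continue  # assumed convention: ignore classes that never occur
        ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For example, with labels `[0, 0, 1, 1]` and predictions `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.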


Summary

Introduction

With advances in scanning devices, much 3D data has been produced and widely used in augmented and virtual reality, 3D games, and robotics. Semantic segmentation of point clouds is an essential 3D scene comprehension task, yet it remains challenging due to the inherent irregularity of point clouds [2]. PointNet [3] was the first neural network to directly process point clouds for 3D segmentation. Its authors applied shared multi-layer perceptrons (MLPs) to point clouds to learn point-wise features and used max/mean pooling to aggregate global features. These global features were concatenated with the point-wise features, and a few further MLPs produced the final semantic segmentation. While global contextual information and local region information are both used for point-wise labeling, local co-occurrence relationships are usually ignored or used only in an implicit way.
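
The PointNet-style pipeline described above (a shared per-point MLP, pooling into a global feature, then concatenation) can be sketched in a few lines. This is a toy, single-layer illustration with hand-set weights, not PointNet itself; the function names, the use of one linear layer with ReLU, and the choice of max pooling over mean pooling are assumptions for the example.

```python
def shared_mlp(points, weights, bias):
    """Apply the same linear layer + ReLU to every point independently."""
    features = []
    for p in points:
        features.append([
            max(0.0, sum(w * x for w, x in zip(row, p)) + b)
            for row, b in zip(weights, bias)
        ])
    return features

def global_max_pool(features):
    """Channel-wise maximum over all points: one global feature vector."""
    return [max(channel) for channel in zip(*features)]

def concat_global(features, global_feature):
    """Append the global feature to each point-wise feature."""
    return [f + global_feature for f in features]

# Two 3D points, a 3 -> 2 layer that simply picks out x and y.
points = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
bias = [0.0, 0.0]

point_features = shared_mlp(points, weights, bias)
global_feature = global_max_pool(point_features)
combined = concat_global(point_features, global_feature)
```

Each row of `combined` holds a point's own feature followed by the same global vector, which is the input a segmentation head would then classify per point. Because max pooling is symmetric, the global feature does not depend on the order of the points.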
