Abstract

Three-dimensional (3D) semantic segmentation of point clouds is important in many scenarios, such as automatic driving, robotic navigation, while edge computing is indispensable in the devices. Deep learning methods based on point sampling prove to be computation and memory efficient to tackle large-scale point clouds (e.g. millions of points). However, some local features may be abandoned while sampling. In this paper, We present one end-to-end 3D semantic segmentation framework based on dilated nearest neighbor encoding. Instead of down-sampling point cloud directly, we propose a dilated nearest neighbor encoding module to broaden the network’s receptive field to learn more 3D geometric information. Without increase of network parameters, our method is computation and memory efficient for large-scale point clouds. We have evaluated the dilated nearest neighbor encoding in two different networks. The first is the random sampling with local feature aggregation. The second is the Point Transformer. We have evaluated the quality of the semantic segmentation on the benchmark 3D dataset S3DIS, and demonstrate that the proposed dilated nearest neighbor encoding exhibited stable advantages over baseline and competing methods.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call