Abstract
With the development of 3D sensors, complex spatial structures can be reconstructed from point cloud data. Accurate semantic segmentation of 3D point clouds benefits tasks such as robotic scene understanding. To extract the geometric features of a point cloud, most prior works rely only on xyz coordinates to implicitly learn the shapes of 3D objects, which is insufficient for complex scenes. In this paper, a Local Geometry Encoding (LGE) module is proposed to provide local normal information and express 3D shapes explicitly. Furthermore, considering that most prior methods use a single decoder to recover from abstract features, a Multi-Decoder (MD) ensemble module is proposed to restore multiple segmentation results from feature maps at various scales in the encoder, and a multi-attention fusion mechanism is then used to merge them into a single segmentation result. Benefiting from the LGE and MD ensemble modules, our network achieves 75% mIoU under 6-fold cross-validation on the S3DIS dataset with fewer parameters.
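As context for the local normal information mentioned above, the sketch below shows one common way to estimate per-point normals, via PCA over each point's k nearest neighbors. This is a generic, assumption-based illustration (the function name, the choice of k, and the NumPy implementation are ours), not the paper's LGE module.

```python
import numpy as np

def estimate_local_normals(points, k=16):
    """Estimate a unit normal per point as the eigenvector associated with
    the smallest eigenvalue of the local neighborhood covariance (PCA)."""
    n = points.shape[0]
    normals = np.zeros((n, 3))
    # Pairwise squared distances; fine for small clouds, use a KD-tree for large ones.
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    knn_idx = np.argsort(d2, axis=1)[:, :k]
    for i in range(n):
        nbrs = points[knn_idx[i]]
        cov = np.cov(nbrs.T)                    # 3x3 covariance of the neighborhood
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
        normals[i] = eigvecs[:, 0]              # direction of least variance ~ surface normal
    return normals

# Example: points sampled near a thin horizontal slab; normals point roughly along z.
pts = np.random.rand(128, 3) * np.array([1.0, 1.0, 0.01])
print(estimate_local_normals(pts)[:3])
```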