As a core task in 3D scene information extraction, point cloud semantic segmentation is crucial for 3D scene understanding and environmental perception. Existing methods focus on extracting local geometric structural features from point clouds and often overlook the long-range dependencies present in the scene, which makes it difficult to fully capture the long-range contextual features hidden within point clouds. To address this, we propose a segmentation algorithm (DG-Net) that integrates dual neighborhood features with global spatial awareness. First, a local structure information encoding module learns local geometric shapes by encoding spatial position and directional features, thereby supplementing structural information. Next, a dual neighborhood feature complementary module merges the geometric structural features and semantic features within local neighborhoods, learning local dependencies and capturing discriminative local contextual features. Finally, these features are passed to a global spatial-aware module equipped with a gated unit, which dynamically adjusts the weights of features at different stages, modeling long-range dependencies between local structures and extracting fine-grained long-range contextual features. We conducted experiments on benchmark point cloud scene datasets, and both quantitative and qualitative results demonstrate that our algorithm accurately identifies small-scale objects with complex geometric structures and surpasses other mainstream networks in segmentation performance. The mIoU scores on the S3DIS, Toronto3D, and SensatUrban datasets are 71.9%, 82.1%, and 59.8%, respectively.
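
For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed from per-point labels. It is not the evaluation code used for DG-Net; the function name `mean_iou`, the toy label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration. Benchmark protocols typically accumulate intersections and unions over the whole test set before averaging across classes.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    pred, target: equal-length sequences of integer class labels, one per point.
    num_classes:  number of semantic classes (labels are 0 .. num_classes - 1).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Points labelled c in both prediction and ground truth.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Points labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: ignore classes absent from both
            ious.append(intersection / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with six points and three classes (hypothetical labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

In this toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.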