Abstract

Interpretation of Airborne Laser Scanning (ALS) point clouds is a critical procedure for producing various geo-information products like 3D city models, digital terrain models and land use maps. In this paper, we present a local and global encoder network (LGENet) for semantic segmentation of ALS point clouds. Adapting the KPConv network, we first extract features by both 2D and 3D point convolutions to allow the network to learn more representative local geometry. Then global encoders are used in the network to exploit contextual information at the object and point level. We design a segment-based Edge Conditioned Convolution to encode the global context between segments. We apply a spatial-channel attention module at the end of the network, which not only captures the global interdependencies between points but also models interactions between channels. We evaluate our method on two ALS datasets, namely the ISPRS benchmark dataset and the DFC2019 dataset. For the ISPRS benchmark dataset, our model achieves state-of-the-art results with an overall accuracy of 0.845 and an average F1 score of 0.737. With regard to the DFC2019 dataset, our proposed network achieves an overall accuracy of 0.984 and an average F1 score of 0.834.
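The spatial-channel attention module described above can be illustrated with a minimal numpy sketch. This is an illustrative simplification, not the authors' implementation: the dot-product affinities, scaling, and residual-sum fusion below are assumptions chosen for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_channel_attention(feats):
    """feats: (N, C) per-point features.

    Spatial branch: each point attends to all other points (N x N affinity),
    capturing global interdependencies between points.
    Channel branch: each channel attends to all other channels (C x C affinity),
    modelling interactions between channels.
    The two refined maps are fused with a residual sum (an assumed fusion;
    the paper's exact fusion may differ).
    """
    n, c = feats.shape
    # Point-to-point attention over the N points.
    spatial = softmax(feats @ feats.T / np.sqrt(c), axis=-1) @ feats
    # Channel-to-channel attention over the C feature channels.
    channel = feats @ softmax(feats.T @ feats / np.sqrt(n), axis=-1)
    return feats + spatial + channel

# Usage: 1000 points with 64-dimensional features.
x = np.random.default_rng(0).normal(size=(1000, 64))
y = spatial_channel_attention(x)
```

Both branches reuse the input features as queries, keys, and values; in a trained network these would come from learned projections.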

Highlights

  • We present LGENet, a local and global encoder network for semantic segmentation of Airborne Laser Scanning (ALS) point clouds, adapted from the KPConv network

  • A hybrid convolution block combines 2D and 3D point convolutions so the network learns more representative local geometry

  • A segment-based Edge Conditioned Convolution (SegECC) encodes global context between segments, and a spatial-channel attention module captures interdependencies between points and interactions between channels

  • Ablation experiments with a PointNet++ backbone on the ISPRS benchmark dataset show that the SegECC layer and the spatial-channel attention also benefit other networks (Table 11)

  • The model achieves state-of-the-art results on the ISPRS benchmark dataset (overall accuracy 0.845, average F1 score 0.737) and an overall accuracy of 0.984 with an average F1 score of 0.834 on the DFC2019 dataset


Summary

Introduction

With the advancing techniques of light detection and ranging (LiDAR) systems, point clouds are increasingly acquired in a wide variety of scenes. The involvement of contextual information between points has been proven effective in improving semantic segmentation results, and this can be achieved by using graphical models such as Conditional Random Fields (CRF) (Niemeyer et al., 2016; Vosselman et al., 2017). In these methods, however, low-dimensional handcrafted features are not representative enough to distinguish all categories in the dataset, especially for ALS point clouds acquired over complicated scenes where objects differ greatly in size. Motivated by the self-attention module proposed by Vaswani et al. (2017) for machine translation, various other approaches adapt this concept for computer vision tasks like semantic segmentation of images (Wang et al., 2018) and point clouds (Feng et al., 2020). These methods explore dependencies between pixels or points but ignore relationships between objects, which are informative for large-scale, complex outdoor scenes.
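Object-level context of the kind the paper encodes with SegECC can be sketched as an edge-conditioned convolution over a graph whose nodes are segments: a small learned function maps each edge's geometric attributes to a filter that gates the neighbouring segment's features. The function name, tanh gating, and mean aggregation below are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def seg_ecc(seg_feats, edges, edge_feats, w_edge):
    """Edge Conditioned Convolution over a segment graph (illustrative sketch).

    seg_feats : (S, C) one feature vector per segment
    edges     : list of (i, j) directed edges between segments
    edge_feats: (E, D) geometric edge attributes (e.g. centroid offsets)
    w_edge    : (D, C) learned weights mapping an edge attribute to a
                per-channel gate for the neighbour's features
    """
    s, c = seg_feats.shape
    out = np.zeros_like(seg_feats)
    deg = np.zeros(s)
    for (i, j), e in zip(edges, edge_feats):
        gate = np.tanh(e @ w_edge)      # edge-conditioned filter, shape (C,)
        out[i] += gate * seg_feats[j]   # aggregate the gated neighbour
        deg[i] += 1
    # Mean over incoming edges; isolated segments stay zero.
    return out / np.maximum(deg, 1)[:, None]

# Usage: 4 segments, 8 channels, 3-dimensional edge attributes.
feats = rng.normal(size=(4, 8))
edges = [(0, 1), (1, 0), (2, 3), (3, 2)]
efeat = rng.normal(size=(4, 3))
w = rng.normal(size=(3, 8))
ctx = seg_ecc(feats, edges, efeat, w)
```

Because the filter is generated from edge attributes rather than shared across all edges, two segments with different spatial relations to their neighbours receive different context, which is what makes the operation suitable for encoding relationships between objects.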

Traditional methods
Deep learning methods
Attention models
Method
Hybrid convolution block
SegECC
Spatial-channel attention
Overall network architecture
Experiments
Experiments on ISPRS benchmark dataset
Ablation study
Experiments on DFC2019 dataset
Findings
Conclusion