Directionally constrained fully convolutional neural network for airborne LiDAR point cloud classification

Congcong Wen,Lina Yang,Xiang Li,Ling Peng,Tianhe Chi

doi:10.1016/j.isprsjprs.2020.02.004

Abstract

Point cloud classification plays an important role in a wide range of airborne light detection and ranging (LiDAR) applications, such as topographic mapping, forest monitoring, power line detection, and road detection. However, due to the sensor noise, high redundancy, incompleteness, and complexity of airborne LiDAR systems, point cloud classification is challenging. Traditional point cloud classification methods mostly focus on the development of handcrafted point geometry features and employ machine learning-based classification models to conduct point classification. In recent years, the advances of deep learning models have caused researchers to shift their focus towards machine learning-based models, specifically deep neural networks, to classify airborne LiDAR point clouds. These learning-based methods start by transforming the unstructured 3D point sets to regular 2D representations, such as collections of feature images, and then employ a 2D CNN for point classification. Moreover, these methods usually need to calculate additional local geometry features, such as planarity, sphericity and roughness, to make use of the local structural information in the original 3D space. Nonetheless, the 3D to 2D conversion results in information loss. In this paper, we propose a directionally constrained fully convolutional neural network (D-FCN) that can take the original 3D coordinates and LiDAR intensity as input; thus, it can directly apply to unstructured 3D point clouds for semantic labeling. Specifically, we first introduce a novel directionally constrained point convolution (D-Conv) module to extract locally representative features of 3D point sets from the projected 2D receptive fields. To make full use of the orientation information of neighborhood points, the proposed D-Conv module performs convolution in an orientation-aware manner by using a directionally constrained nearest neighborhood search. Then, we design a multiscale fully convolutional neural network with downsampling and upsampling blocks to enable multiscale point feature learning. The proposed D-FCN model can therefore process input point cloud with arbitrary sizes and directly predict the semantic labels for all the input points in an end-to-end manner. Without involving additional geometry features as input, the proposed method demonstrates superior performance on the International Society for Photogrammetry and Remote Sensing (ISPRS) 3D labeling benchmark dataset. The results show that our model achieves a new state-of-the-art performance on powerline, car, and facade categories. Moreover, to demonstrate the generalization abilities of the proposed method, we conduct further experiments on the 2019 Data Fusion Contest Dataset. Our proposed method achieves superior performance than the comparing methods and accomplishes an overall accuracy of 95.6% and an average F1 score of 0.810.

Full Text