Abstract

Accurate semantic segmentation of 3D point clouds is a long-standing problem in remote sensing and computer vision. Due to the unstructured nature of point clouds, designing deep neural architectures for point cloud semantic segmentation is often not straightforward. In this work, we circumvent this problem by devising a technique to exploit structured neural architectures for unstructured data. In particular, we employ popular convolutional neural network (CNN) architectures to perform semantic segmentation of LiDAR data. We propose a projection-based scheme that performs an angle-wise slicing of large 3D point clouds and transforms those slices into 2D grids. Accounting for the intensity and reflectivity of the LiDAR input, we use the 2D grid to construct a pseudo image for the point cloud slice. We enhance this image with the low-level image processing techniques of normalization, histogram equalization, and decorrelation stretch to suit our ultimate objective of semantic segmentation. A large number of images thus generated are used to train an encoder-decoder CNN model that learns to compute a segmented 2D projection of the scene, which we finally back-project onto the 3D point cloud. In addition to a novel method, this article makes a second major contribution by introducing an enhanced version of our large-scale public PC-Urban outdoor dataset, which is captured in a civic setup with an Ouster LiDAR sensor. The updated dataset (PC-Urban_V2) provides nearly 8 billion points, including over 100 million points labeled for 25 classes of interest. We provide a thorough evaluation of our technique on PC-Urban_V2 and three other public datasets.
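The angle-wise projection described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, grid dimensions, and vertical field of view are hypothetical choices, and the three channels (range, intensity, reflectivity) stand in for whatever pseudo-image channels the authors actually use.

```python
import numpy as np

def project_slice_to_grid(points, h=64, w=1024, fov_up=15.0, fov_down=-15.0):
    """Project a point-cloud slice (N x 5 array: x, y, z, intensity,
    reflectivity) onto an h x w 2D grid indexed by elevation and azimuth.

    Note: dimensions and field of view are illustrative assumptions,
    not values taken from the paper.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points[:, :3], axis=1) + 1e-8  # range per point

    azimuth = np.arctan2(y, x)        # horizontal angle in [-pi, pi]
    elevation = np.arcsin(z / r)      # vertical angle in [-pi/2, pi/2]

    # Map angles to integer pixel coordinates on the 2D grid.
    u = ((azimuth + np.pi) / (2 * np.pi) * w).astype(int) % w
    fov = np.radians(fov_up - fov_down)
    v = ((np.radians(fov_up) - elevation) / fov * h).astype(int)
    v = np.clip(v, 0, h - 1)

    # Fill a 3-channel pseudo image: range, intensity, reflectivity.
    image = np.zeros((h, w, 3), dtype=np.float32)
    image[v, u, 0] = r
    image[v, u, 1] = points[:, 3]
    image[v, u, 2] = points[:, 4]
    return image
```

When several points fall into the same pixel, this sketch simply keeps the last one written; a real pipeline would typically keep the nearest point per cell before the enhancement steps (normalization, histogram equalization, decorrelation stretch) are applied to the resulting image.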

Highlights

  • Semantic segmentation plays an important role in scene understanding

  • We report the mean intersection over union (mIoU) and overall accuracy (OA), in %, for comparison

  • Our method outperformed all other approaches in terms of class-wise accuracy on six classes: bicycle, truck, other-vehicle, bicyclist, other-ground, and fence


Introduction

Semantic segmentation plays an important role in scene understanding. Images have traditionally been used for this task, but they fail to accurately encode the geometry of real-world scenes. A LiDAR sensor captures precise coordinate information for multiple points in the scene, thereby preserving the 3D geometry, and readily provides depth information that is inherently more suitable for semantic segmentation [1]. 3D point clouds obtained from LiDAR are finding many applications in emerging technologies such as remote sensing, site surveying, self-driving cars, and 3D urban environment modeling [2]. Due to the unstructured and sparse nature of point clouds, however, their accurate semantic segmentation remains an open research problem.

