An Efficient Ensemble Deep Learning Approach for Semantic Point Cloud Segmentation Based on 3D Geometric Features and Range Images

Abstract

Mobile light detection and ranging (LiDAR) sensor point clouds are used in many fields, such as road network management, architecture and urban planning, and 3D High Definition (HD) city maps for autonomous vehicles. Semantic segmentation of mobile point clouds is critical for these tasks. In this study, we present a robust and effective deep learning-based point cloud semantic segmentation method. Semantic segmentation is applied to range images produced from the point cloud by spherical projection. Irregular 3D mobile point clouds are transformed into a regular form by projecting them onto a plane to generate a 2D representation of the point cloud. This representation is fed to the proposed network, which produces the semantic segmentation. A local geometric feature vector is calculated for each point. Experiments to find the optimum parameters were also performed to obtain the best results for semantic segmentation. The proposed technique, called SegUNet3D, is an ensemble approach based on the combination of the U-Net and SegNet algorithms. The SegUNet3D algorithm was compared with five different segmentation algorithms on two challenging datasets. The SemanticPOSS dataset covers an urban area, whereas RELLIS-3D covers an off-road environment. The results demonstrate that the proposed approach is superior to the other methods in terms of mean Intersection over Union (mIoU) on both datasets. The proposed method improved the mIoU metric by up to 15.9% on the SemanticPOSS dataset and up to 5.4% on the RELLIS-3D dataset.
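
The spherical projection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `spherical_projection`, the image size (64 x 1024), and the vertical field of view (+3 to -25 degrees) are assumptions chosen for illustration, and the real values depend on the sensor used for each dataset. Each point is mapped to a pixel by its azimuth and elevation angles, and the pixel stores the range of the nearest point that falls into it.

```python
import numpy as np


def spherical_projection(points, height=64, width=1024,
                         fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) point cloud onto a (height, width) range image.

    All default parameters are illustrative assumptions, not values
    taken from the paper. Empty pixels are filled with -1.
    """
    r = np.linalg.norm(points, axis=1)
    valid = r > 0                      # drop points at the sensor origin
    x, y, z, r = points[valid, 0], points[valid, 1], points[valid, 2], r[valid]

    yaw = np.arctan2(y, x)             # azimuth in [-pi, pi]
    pitch = np.arcsin(z / r)           # elevation in [-pi/2, pi/2]

    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)
    fov = fov_up - fov_down

    # Normalise the angles to pixel coordinates.
    u = 0.5 * (1.0 - yaw / np.pi) * width            # column index
    v = (1.0 - (pitch - fov_down) / fov) * height    # row index

    u = np.clip(np.floor(u), 0, width - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, height - 1).astype(np.int32)

    # Write far points first so that the nearest point per pixel is kept
    # (NumPy keeps the last value written for repeated indices).
    order = np.argsort(r)[::-1]
    image = np.full((height, width), -1.0, dtype=np.float32)
    image[v[order], u[order]] = r[order]
    return image
```

In a pipeline of this kind, additional channels (for example x, y, z, intensity, or per-point geometric features) would typically be rasterised with the same `u`, `v` indices and stacked with the range channel before being passed to a 2D network.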

Similar Papers
  • Research Article
  • Citations: 7
  • 10.3390/electronics11203310
Selection of Relevant Geometric Features Using Filter-Based Algorithms for Point Cloud Semantic Segmentation
  • Oct 14, 2022
  • Electronics
  • Muhammed Enes Atik + 1 more

Semantic segmentation of mobile LiDAR point clouds is an essential task in many fields such as road network management, mapping, urban planning, and 3D High Definition (HD) city maps for autonomous vehicles. This study presents an approach to improve the evaluation metrics of deep-learning-based point cloud semantic segmentation using 3D geometric features and filter-based feature selection. Information gain (IG), Chi-square (Chi2), and ReliefF algorithms are used to select relevant features. RandLA-Net and Superpoint Graph (SPG), two current and effective deep learning networks, were used for semantic segmentation. RandLA-Net and SPG were fed the geometric features in addition to the 3D coordinates (x, y, z) directly, without any change in the structure of the point clouds. Experiments were carried out on three challenging mobile LiDAR datasets: Toronto3D, SZTAKI-CityMLS, and Paris. The study demonstrated that the selection of relevant features improved accuracy on all datasets. For RandLA-Net, the mean Intersection-over-Union (mIoU) was 70.1% with the features selected with Chi2 on the Toronto3D dataset, 84.1% with the features selected with IG on the SZTAKI-CityMLS dataset, and 55.2% with the features selected with IG and ReliefF on the Paris dataset. For SPG, 69.8% mIoU was obtained with Chi2 on the Toronto3D dataset, 77.5% mIoU was obtained with IG on SZTAKI-CityMLS, and 59.0% mIoU was obtained with IG and ReliefF on Paris.
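
The abstract does not list which 3D geometric features were computed, so the following is only a sketch of one widely used family: features derived from the eigenvalues of the covariance matrix of each point's local neighbourhood (linearity, planarity, sphericity). The function name `eigen_features`, the neighbourhood size `k`, and the brute-force neighbour search are assumptions made for illustration.

```python
import numpy as np


def eigen_features(points, k=10):
    """Return an (N, 3) array of [linearity, planarity, sphericity].

    Hypothetical helper: features come from the eigenvalues
    l1 >= l2 >= l3 of the covariance of each point's k nearest
    neighbours (the point itself included).
    """
    n = points.shape[0]
    # Brute-force pairwise squared distances, O(N^2) memory:
    # acceptable for a small sketch, not for full LiDAR scans.
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    nn = np.argsort(d2, axis=1)[:, :k]

    feats = np.zeros((n, 3))
    for i in range(n):
        cov = np.cov(points[nn[i]], rowvar=False)      # 3 x 3 covariance
        evals = np.linalg.eigvalsh(cov)[::-1]          # descending order
        l1, l2, l3 = np.clip(evals, 0.0, None)         # guard tiny negatives
        if l1 > 0:
            feats[i] = [(l1 - l2) / l1, (l2 - l3) / l1, l3 / l1]
    return feats
```

Features like these can be appended to the (x, y, z) coordinates as extra input columns, which is the kind of input extension the abstract describes; a filter-based selector would then score each column against the labels and keep only the most relevant ones.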

  • Conference Article
  • Citations: 2
  • 10.1109/iros47612.2022.9982099
Low-Latency LiDAR Semantic Segmentation
  • Oct 23, 2022
  • Takahiro Hori + 1 more

Several methods of semantic segmentation using light detection and ranging (LiDAR) sensors have been proposed for the recognition of surrounding objects by autonomous driving cars. LiDAR is a sensor that compensates for the weaknesses of other sensors, such as cameras or radar systems, and semantic segmentation assigns a class label to each point in the LiDAR point cloud. Recently, real-time semantic segmentation methods that are capable of processing LiDAR point clouds at frame rates have been proposed. Real-time semantic segmentation is essential for the autonomous driving system because it can output class labels for LiDAR point clouds at high speeds. However, this segmentation method suffers from a delay equal to processing time. To address this challenge, we propose a novel method that combines SalsaNext [1], a method of real-time LiDAR semantic segmentation, and semantic forecasting, which predicts the results of future semantic segmentation. We quantitatively evaluate our method using the Semantic-KITTI dataset, which comprises point cloud data acquired from the LiDAR sensor in the real world, and compare the latency and accuracy of our method with other semantic segmentation methods. Consequently, our method is found to be capable of operating in real-time and with low-latency, and it can achieve a performance similar to that of previously reported real-time semantic segmentation methods.

  • Research Article
  • Citations: 35
  • 10.1016/j.ophoto.2021.100011
Semantic segmentation of point cloud data using raw laser scanner measurements and deep neural networks
  • Dec 16, 2021
  • ISPRS Open Journal of Photogrammetry and Remote Sensing
  • Risto Kaijaluoto + 4 more

Deep learning methods based on convolutional neural networks have been shown to give excellent results in semantic segmentation of images, but the inherent irregularity of point cloud data complicates their usage in semantically segmenting 3D laser scanning data. To overcome this problem, point cloud networks particularly specialized for the purpose have been implemented since 2017, but finding the most appropriate way to semantically segment point clouds is still an open research question. In this study we attempted semantic segmentation of point cloud data with convolutional neural networks by using only the raw measurements provided by a multiple echo detection capable profiling laser scanner. We formatted the measurements to a series of 2D rasters, where each raster contains the measurements (range, reflectance, echo deviation) of a single scanner mirror rotation, to be able to use the rich research done on semantic segmentation of 2D images with convolutional neural networks. A similar approach for a profiling laser scanner in a forest context has never been proposed before. A boreal forest in the Evo region near Hämeenlinna in Finland was used as the experimental study area. The data was collected with the FGI Akhka-R3 backpack laser scanning system, georeferenced and then manually labelled to ground, understorey, tree trunk and foliage classes for training and evaluation purposes. The labelled points were then transformed back to 2D rasters and used for training three different neural network architectures. Further, the same georeferenced data in point cloud format was used for training the state-of-the-art point cloud semantic segmentation network RandLA-Net and the results were compared with those of our method. Our best semantic segmentation network reached a mean Intersection-over-Union value of 80.1%, which is comparable to the 80.6% reached by the point cloud-based RandLA-Net. The numerical results and visual analysis of the resulting point clouds show that our method is a valid way of doing semantic segmentation of point clouds at least in the forest context. The labelled datasets were also released to the research community.

  • Research Article
  • Citations: 30
  • 10.3390/rs15030576
Framework for Geometric Information Extraction and Digital Modeling from LiDAR Data of Road Scenarios
  • Jan 18, 2023
  • Remote Sensing
  • Yuchen Wang + 6 more

Road geometric information and a digital model based on light detection and ranging (LiDAR) can perform accurate geometric inventories and three-dimensional (3D) descriptions for as-built roads and infrastructures. However, unorganized point clouds and complex road scenarios would reduce the accuracy of geometric information extraction and digital modeling. There is a standardization need for information extraction and 3D model construction that integrates point cloud processing and digital modeling. This paper develops a framework from semantic segmentation to geometric information extraction and digital modeling based on LiDAR data. A semantic segmentation network is improved for the purpose of dividing the road surface and infrastructure. The road boundary and centerline are extracted by the alpha-shape and Voronoi diagram methods based on the semantic segmentation results. The road geometric information is obtained by a coordinate transformation matrix and the least square method. Subsequently, adaptive road components are constructed using Revit software. Thereafter, the road route, road entity model, and various infrastructure components are generated by the extracted geometric information through Dynamo and Revit software. Finally, a detailed digital model of the road scenario is developed. The Toronto-3D and Semantic3D datasets are utilized for analysis through training and testing. The overall accuracy (OA) of the proposed net for the two datasets is 95.3 and 95.0%, whereas the IoU of segmented road surfaces is 95.7 and 97.9%. This indicates that the proposed net could accomplish superior performance for semantic segmentation of point clouds. The mean absolute errors between the extracted and manually measured geometric information are marginal. This demonstrates the effectiveness and accuracy of the proposed extraction methods. Thus, the proposed framework could provide a reference for accurate extraction and modeling from LiDAR data.

  • Research Article
  • Citations: 17
  • 10.3390/rs14122768
Construction of a Semantic Segmentation Network for the Overhead Catenary System Point Cloud Based on Multi-Scale Feature Fusion
  • Jun 9, 2022
  • Remote Sensing
  • Tao Xu + 5 more

Accurate semantic segmentation results of the overhead catenary system (OCS) are significant for OCS component extraction and geometric parameter detection. In practice, OCS scenes are complex, and the density of point cloud data obtained through Light Detection and Ranging (LiDAR) scanning is uneven because of the differing characteristics of OCS components. Because of these inconsistent component points, it is challenging to achieve better semantic segmentation of the OCS point cloud with existing deep learning methods. Therefore, this paper proposes a point cloud multi-scale feature fusion refinement structure neural network (PMFR-Net) for semantic segmentation of the OCS point cloud. The PMFR-Net includes a prediction module and a refinement module. The innovations of the prediction module include the double efficient channel attention module (DECA) and the serial hybrid domain attention (SHDA) structure. The point cloud refinement module (PCRM) is used as the refinement module of the network. DECA focuses on detail features; SHDA strengthens the connection of contextual semantic information; PCRM further refines the segmentation results of the prediction module. In addition, this paper created and released a new dataset of the OCS point cloud. Based on this dataset, the overall accuracy (OA), F1-score, and mean intersection over union (MIoU) of PMFR-Net reached 95.77%, 93.24%, and 87.62%, respectively. Compared with four state-of-the-art (SOTA) point cloud deep learning methods, the comparative experimental results showed that PMFR-Net achieved the highest accuracy and the shortest training time. The segmentation performance of PMFR-Net on the public S3DIS dataset is also better than that of the other four SOTA segmentation methods. In addition, the effectiveness of DECA, the SHDA structure, and PCRM was verified in the ablation experiment. The experimental results show that this network could be applied to practical applications.

  • Research Article
  • Citations: 18
  • 10.3390/rs12223830
Slice-Based Instance and Semantic Segmentation for Low-Channel Roadside LiDAR Data
  • Nov 21, 2020
  • Remote Sensing
  • Hui Liu + 3 more

More and more scholars are committed to light detection and ranging (LiDAR) as a roadside sensor to obtain traffic flow data. Filtering and clustering are common methods to extract pedestrians and vehicles from point clouds. This kind of method ignores the impact of environmental information on traffic. The segmentation process is a crucial part of detailed scene understanding, which could be especially helpful for locating, recognizing, and classifying objects in certain scenarios. However, there are few studies on the segmentation of low-channel (16 channels in this paper) roadside 3D LiDAR. This paper presents a novel segmentation (slice-based) method for point clouds of roadside LiDAR. The proposed method can be divided into two parts: the instance segmentation part and semantic segmentation part. The part of the instance segmentation of point cloud is based on the regional growth method, and we proposed a seed point generation method for low-channel LiDAR data. Furthermore, we optimized the instance segmentation effect under occlusion. The part of semantic segmentation of a point cloud is realized by classifying and labeling the objects obtained by instance segmentation. For labeling static objects, we represented and classified a certain object through the related features derived from its slices. For labeling moving objects, we proposed a recurrent neural network (RNN)-based model, of which the accuracy could be up to 98.7%. The result implies that the slice-based method can obtain a good segmentation effect and the slice has good potential for point cloud segmentation.

  • Research Article
  • Citations: 34
  • 10.1109/access.2020.2992612
CSPC-Dataset: New LiDAR Point Cloud Dataset and Benchmark for Large-Scale Scene Semantic Segmentation
  • Jan 1, 2020
  • IEEE Access
  • Guofeng Tong + 5 more

Large-scale point clouds scanned by light detection and ranging (LiDAR) sensors provide detailed geometric characteristics of scenes due to the provision of 3D structural data. The semantic segmentation of large-scale point clouds is a crucial step for an in-depth understanding of complex scenes. Of late, although a large number of point cloud semantic segmentation algorithms have been proposed, semantic segmentation methods are still far from being satisfactory in terms of precision and accuracy of large-scale point clouds. For machine learning (ML) and deep learning (DL) methodologies, the semantic segmentation is largely influenced by the quality of training sets and methods themselves. Therefore, we construct a new point cloud dataset, namely CSPC-Dataset (Complex Scene Point Cloud Dataset), for large-scale scene semantic segmentation. CSPC-Dataset point clouds are acquired by a wearable laser mobile mapping robot. It covers five complex urban and rural scenes and mainly includes six types of objects, i.e., ground, car, building, vegetation, bridge, and pole. It provides large-scale outdoor scenes with color information, with advantages such as more complete scenes, relatively uniform point density, diverse and complex objects, and high discrepancy between different scenes. Based on the CSPC-Dataset, we construct a new benchmark, which includes approximately 68 million points with explicit semantic labels. To extend the dataset into a wide range of applications, this paper provides the semantic segmentation results and comparative analysis of 7 baseline methods based on CSPC-Dataset. In the experiment part, three groups of experiments are conducted for benchmarking, which offers an effective way to make comparisons with different point-labeling algorithms. The labeling results show that the highest Intersection over Union (IoU) of pole, ground, building, car, vegetation, and bridge for all benchmarks is 36.0%, 97.8%, 93.7%, 65.6%, 92.0%, and 69.6%, respectively.
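
Per-class IoU figures such as those reported above follow the standard definition IoU = TP / (TP + FP + FN), and mIoU is the mean over classes. The sketch below computes both from a confusion matrix; the function names are illustrative, and excluding classes that never occur in either the labels or the predictions from the mean is a design choice assumed here, not something stated in any of the abstracts.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    y_true = np.asarray(y_true, dtype=np.int64)
    y_pred = np.asarray(y_pred, dtype=np.int64)
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def iou_per_class(cm):
    """IoU = TP / (TP + FP + FN) for each class; NaN if the class is absent."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp           # predicted as the class, but wrong
    fn = cm.sum(axis=1) - tp           # belongs to the class, but missed
    denom = tp + fp + fn
    iou = np.full(cm.shape[0], np.nan)
    present = denom > 0
    iou[present] = tp[present] / denom[present]
    return iou


def mean_iou(cm):
    """Mean of the per-class IoUs, ignoring absent classes (assumption)."""
    return float(np.nanmean(iou_per_class(cm)))
```

For example, with ground-truth labels `[0, 0, 1, 1, 2, 2]` and predictions `[0, 1, 1, 1, 2, 0]` over three classes, the per-class IoUs are 1/3, 2/3, and 1/2, giving an mIoU of 0.5.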

  • Research Article
  • Citations: 8
  • 10.1016/j.jag.2022.103027
Extraction of local structure information of point clouds through space-filling curve for semantic segmentation
  • Nov 1, 2022
  • International Journal of Applied Earth Observation and Geoinformation
  • Xueyong Xiang + 3 more

  • Research Article
  • Citations: 11
  • 10.1145/3649442
Machine and Deep Learning Implementations for Heritage Building Information Modelling: A Critical Review of Theoretical and Applied Research
  • Apr 23, 2024
  • Journal on Computing and Cultural Heritage
  • Aleksander Gil + 3 more

Research domain and Problem: HBIM modelling from point cloud data has become a crucial research topic in the last decade since it is potentially considered as the central data model paving the way for the digital heritage practice beyond digitization. Reality Capture technologies such as terrestrial laser scanning, drone-mounted LiDAR sensors and photogrammetry enable the reality capture with a sub-millimetre accurate point cloud file that can be used as a reference file for Heritage Building Information Modelling (HBIM). However, HBIM modelling from the point cloud data of heritage buildings is mainly manual, error-prone, and time-consuming. Furthermore, image processing techniques are insufficient for classification and segmentation of point cloud data to speed up and enhance the current workflow for HBIM modelling. Due to the challenges and bottlenecks in the scan-to-HBIM process, which is commonly criticized as complex with its bespoke requirements, semantic segmentation of point clouds is gaining popularity in the literature. Research Aim and Methodology: Therefore, this paper aims to provide a thorough critical review of Machine Learning and Deep Learning methods for point cloud segmentation, classification, and BIM geometry automation for cultural heritage case study applications. Research findings: This paper files the challenges of HBIM practice and the opportunities for semantic point cloud segmentation found across academic literature in the last decade. Beyond definitions and basic occurrence statistics, this paper discusses the success rates and implementation challenges of machine and deep learning classification methods. Research value and contribution: This paper provides a holistic review of point cloud segmentation and its potential for further development and application in the Cultural Heritage sector. The critical analysis provides insight into the current state-of-the-art methods and advises on their suitability for HBIM projects. The review has identified highly original threads of research, which hold the potential to significantly influence practice and further applied research.

  • Research Article
  • Citations: 23
  • 10.3390/rs15092371
Deep-Learning-Based Semantic Segmentation Approach for Point Clouds of Extra-High-Voltage Transmission Lines
  • Apr 30, 2023
  • Remote Sensing
  • Hao Yu + 9 more

The accurate semantic segmentation of point cloud data is the basis for their application in the inspection of extra high-voltage transmission lines (EHVTL). As deep learning evolves, point-wise-based deep neural networks have shown great potential for the semantic segmentation of EHVTL point clouds. However, EHVTL point cloud data are characterized by a large data volume and significant class imbalance. Therefore, the down-sampling method and point cloud feature extraction method used in current point-wise-based deep neural networks hardly meet the needs of computational accuracy and efficiency. In this paper, we proposed a two-step down-sampling method and a point cloud feature extraction method based on local feature aggregation of the point clouds after down-sampling in each layer of the model (LFAPAD). We then established a deep neural network named PowerLine-Net for the semantic segmentation of the EHVTL point clouds. Furthermore, in order to test and analyze the performance of PowerLine-Net, we constructed a point cloud dataset for the EHVTL scenes. Using this dataset and the Semantic3D dataset, we implemented network parameter testing, semantic segmentation, and an accuracy comparison of different networks based on PowerLine-Net. The results illustrate that the semantic segmentation model proposed in this paper has a high computational efficiency and accuracy in the semantic segmentation of EHVTL point clouds. Compared with conventional deep neural networks, including PointCNN, KPConv, SPG, PointNet++, and RandLA-Net, PowerLine-Net also achieves a higher accuracy in the semantic segmentation of EHVTL point clouds. Moreover, based on the results predicted by PowerLine-Net, the risk point detection for EHVTL point clouds has been achieved, which demonstrates the important value of this network in practical applications. In addition, as shown by the results of Semantic3D, PowerLine-Net also achieves a high segmentation accuracy, which proves its powerful capability and wide applicability in semantic segmentation for the point clouds of large-scale scenes.

  • Research Article
  • Citations: 86
  • 10.1016/j.eswa.2022.118815
PCSCNet: Fast 3D semantic segmentation of LiDAR point cloud for autonomous car using point convolution and sparse convolution network
  • Sep 13, 2022
  • Expert Systems with Applications
  • Jaehyun Park + 3 more

  • Research Article
  • Citations: 14
  • 10.1016/j.jag.2022.102974
A self-attention based global feature enhancing network for semantic segmentation of large-scale urban street-level point clouds
  • Sep 1, 2022
  • International Journal of Applied Earth Observation and Geoinformation
  • Qi Chen + 5 more

  • Research Article
  • Citations: 26
  • 10.3390/rs14153842
A Deep Learning-Based Method for Extracting Standing Wood Feature Parameters from Terrestrial Laser Scanning Point Clouds of Artificially Planted Forest
  • Aug 8, 2022
  • Remote Sensing
  • Xingyu Shen + 4 more

The use of 3D point cloud-based technology for quantifying standing wood and stand parameters can play a key role in forestry ecological benefit assessment and standing tree cultivation and utilization. With the advance of 3D information acquisition techniques, such as light detection and ranging (LiDAR) scanning, the stand information of trees in large areas and complex terrain can be obtained more efficiently. However, due to the diversity of the forest floor, the morphological diversity of the trees, and the fact that forestry is often planted as large-scale plantations, efficiently segmenting the point cloud of artificially planted forests and extracting standing wood feature parameters remains a considerable challenge. An effective method based on energy segmentation and PointCNN is proposed in this work to address this issue. The network is enhanced for learning point cloud features by a geometric feature balance model (GFBM), enabling the efficient segmentation of tree point clouds from forestry point cloud data collected by terrestrial laser scanning (TLS) in outdoor environments. The 3D Forest software is then used to obtain single wood point clouds after semantic segmentation, and the extracted single wood point clouds are finally employed to extract standing wood feature parameters using TreeQSM. The point cloud semantic segmentation method is the most important part of our research. According to our findings, this method can segment datasets of two different artificially planted woodland point clouds with an overall accuracy of 0.95 and a tree segmentation accuracy of 0.93. When compared with the manual measurements, the root-mean-square errors (RMSEs) for tree height in the two datasets are 0.30272 and 0.21015 m, and the RMSEs for the diameter at breast height are 0.01436 and 0.01222 m, respectively. Our method is a robust framework based on deep learning that is applicable to forestry for extracting the feature parameters of artificially planted trees. It solves the problem of segmenting tree point clouds in artificially planted trees and provides a reliable data processing method for tree information extraction, trunk shape analysis, etc.

  • Research Article
  • Citations: 6
  • 10.1109/jstars.2023.3264240
A Category-Contrastive Guided-Graph Convolutional Network Approach for the Semantic Segmentation of Point Clouds
  • Jan 1, 2023
  • IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
  • Xuzhe Wang + 5 more

The semantic segmentation of light detection and ranging (LiDAR) point clouds plays an important role in 3-D scene intelligent perception and semantic modeling. The unstructured, sparse and uneven characteristics of point clouds pose great challenges to the representation of the local geometric shapes, which degrades semantic segmentation performance. To address the challenges of describing local geometric shapes due to unstructured and sparse 3-D point clouds, this article proposes a category-contrastive-guided graph convolutional network (CGGC-Net) for the semantic segmentation of LiDAR point clouds. First, a detailed geometric structure of the raw point clouds is encoded to represent the inherent geometric pattern within the local neighborhood. At the same time, the geometric structures information is transmitted across multiple layers, so that the geometric structure encoding information containing different receptive fields and richer neighborhood spatial structure can be aggregated. Following this, the graph convolution neural network uses the edge convolution layer to adaptively describe the semantic correlation between the query point and its neighboring points, and combines the attention mechanism to gather the surrounding feature information to the query point. As a result, the graph convolution neural network and attention mechanism are iteratively stacked for the aggregation and fusion of spatial context semantic information, to generate highly discriminative semantic feature representation. Finally, the superparameters of the model are learned through a multitask optimization strategy guided by category-aware contrastive loss and cross-entropy loss. Experiments are conducted on the public SemanticKITTI dataset and the Stanford large-scale 3-D Indoor Spaces dataset to demonstrate the effectiveness and reliability of the proposed CGGC-Net from both quantitative and qualitative perspectives. The results indicate its capability of automatically classifying LiDAR point clouds, with a mean intersection-over-union of 58.4%. Moreover, multiple comparative experiments also demonstrate the superior performance of the proposed method, exceeding state-of-the-art methods.

  • Research Article
  • Citations: 30
  • 10.1016/j.tust.2024.105829
STSD:A large-scale benchmark for semantic segmentation of subway tunnel point cloud
  • May 30, 2024
  • Tunnelling and Underground Space Technology incorporating Trenchless Technology Research
  • Hao Cui + 5 more
