Semantic Context Encoding for Accurate 3D Point Cloud Segmentation

Abstract

Semantic context plays a significant role in image segmentation. However, few prior works have explored semantic context for 3D point cloud segmentation. In this paper, we propose a simple yet effective Point Context Encoding (PointCE) module to capture the semantic context of a point cloud and adaptively highlight intermediate feature maps. We also introduce a Semantic Context Encoding loss (SCE-loss) to supervise the network to learn rich semantic context features. To avoid hyperparameter tuning and achieve better convergence, we further propose a geometric mean loss to integrate the SCE-loss and the segmentation loss. Our PointCE module is general and lightweight, and can be integrated into any point cloud segmentation architecture to improve its segmentation performance with only marginal extra overhead. Experimental results on the ScanNet, S3DIS and Semantic3D datasets show that integrating our PointCE module yields consistent and significant improvements for several different networks.
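
The abstract does not give the exact form of the geometric mean loss, so the following is only a minimal sketch under the assumption that it is the plain geometric mean of two non-negative scalar losses. The function name and argument names are hypothetical and are not taken from the paper.

```python
import math


def geometric_mean_loss(seg_loss: float, sce_loss: float) -> float:
    """Combine two non-negative scalar losses by their geometric mean.

    Assumption: this is the textbook geometric mean sqrt(a * b); the
    paper's actual formulation may differ.
    """
    if seg_loss < 0 or sce_loss < 0:
        raise ValueError("losses must be non-negative")
    return math.sqrt(seg_loss * sce_loss)
```

Unlike a weighted sum of the two losses, this form has no weighting coefficient to tune, which matches the motivation stated in the abstract.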

Similar Papers
  • Research Article
  • Citations: 1
  • 10.3390/app14051777
Semantic Segmentation of 3D Point Clouds in Outdoor Environments Based on Local Dual-Enhancement
  • Feb 22, 2024
  • Applied Sciences
  • Kai Zhang + 3 more

Semantic segmentation of 3D point clouds in drivable areas is very important for unmanned vehicles. Because outdoor scene objects vary widely in size and the sample sizes are imbalanced, object boundaries are unclear and the features of small-sample classes cannot be extracted. As a result, the semantic segmentation accuracy of 3D point clouds in outdoor environments is low. To solve these problems, we propose a local dual-enhancement network (LDE-Net) for semantic segmentation of 3D point clouds in outdoor environments for unmanned vehicles. The network is composed of local-global feature extraction modules and a local feature aggregation classifier. The local-global feature extraction module captures both local and global features, which can improve the accuracy and robustness of semantic segmentation. The local feature aggregation classifier considers the feature information of neighboring points to ensure clear object boundaries and high overall segmentation accuracy. Experimental results show that LDE-Net provides clearer boundaries between objects and identifies small-sample objects more accurately. LDE-Net performs well for semantic segmentation of 3D point clouds in outdoor environments.

  • Research Article
  • Citations: 24
  • 10.3390/rs14143415
A Prior Level Fusion Approach for the Semantic Segmentation of 3D Point Clouds Using Deep Learning
  • Jul 16, 2022
  • Remote Sensing
  • Zouhair Ballouch + 4 more

Three-dimensional digital models play a pivotal role in city planning, monitoring, and sustainable management of smart and Digital Twin Cities (DTCs). In this context, semantic segmentation of airborne 3D point clouds is crucial for modeling, simulating, and understanding large-scale urban environments. Previous research studies have demonstrated that the performance of 3D semantic segmentation can be improved by fusing 3D point clouds and other data sources. In this paper, a new prior-level fusion approach is proposed for semantic segmentation of large-scale urban areas using optical images and point clouds. The proposed approach uses image classification obtained by the Maximum Likelihood Classifier as the prior knowledge for 3D semantic segmentation. Afterwards, the raster values from classified images are assigned to Lidar point clouds at the data preparation step. Finally, an advanced Deep Learning model (RandLaNet) is adopted to perform the 3D semantic segmentation. The results show that the proposed approach performs well in terms of both evaluation metrics and visual examination, reaching an Intersection over Union of 96% on the created dataset, compared with 92% for the non-fusion approach.

  • Conference Article
  • 10.1109/dasc-picom-cbdcom-cyberscitech49142.2020.00034
Segmentation of 3D Point Clouds for Weak Texture Ground Plane
  • Aug 1, 2020
  • Ming-Can Geng + 3 more

Segmenting the ground plane in 3D point clouds can generate a drivable area for autonomous robot navigation. Compared with lasers, cameras used to generate 3D point clouds provide more information and scale better. However, when a camera is used to generate 3D point clouds of the ground, the ground lacks texture and is therefore often missing from the resulting point cloud. This paper proposes a method for segmenting 3D point clouds of a weakly textured ground plane. First, a point cloud preprocessing step is designed using downsampling methods. Second, a Euclidean clustering algorithm is used to segment the point cloud. Third, vertical projection and plane fitting based on the RANSAC algorithm are proposed. Fourth, the feasible region, that is, the area the robot can safely pass through, is segmented. Finally, the proposed method is verified experimentally.
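
The plane-fitting step named in this abstract can be illustrated with a generic RANSAC plane fit. This is a minimal sketch of the textbook algorithm, not the authors' implementation; the function name, the distance threshold, the iteration count and the fixed seed are all assumptions chosen for illustration.

```python
import random


def ransac_plane(points, threshold=0.05, iterations=100, seed=0):
    """Fit a plane n.x + d = 0 to 3D points with basic RANSAC.

    Returns ((nx, ny, nz, d), inliers). All parameters are illustrative.
    """
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        # Hypothesis: the plane through three randomly sampled points.
        p1, p2, p3 = rng.sample(points, 3)
        u = [p2[i] - p1[i] for i in range(3)]
        v = [p3[i] - p1[i] for i in range(3)]
        n = [u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0]]
        norm = (n[0] ** 2 + n[1] ** 2 + n[2] ** 2) ** 0.5
        if norm < 1e-12:
            continue  # collinear sample, no unique plane
        n = [x / norm for x in n]
        d = -sum(n[i] * p1[i] for i in range(3))
        # Consensus: points within `threshold` of the hypothesised plane.
        inliers = [p for p in points
                   if abs(sum(n[i] * p[i] for i in range(3)) + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = (n[0], n[1], n[2], d), inliers
    return best_plane, best_inliers
```

In a ground-segmentation setting, the returned inliers would be the candidate ground points and the remaining points the obstacles.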

  • Research Article
  • Citations: 2
  • 10.5194/isprs-archives-xlii-2-w13-785-2019
SEMANTIC SEGMENTATION OF INDOOR 3D POINT CLOUD WITH SLENET
  • Jun 5, 2019
  • The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
  • Y Ding + 3 more

Abstract. With the rapid development of new indoor sensors and acquisition techniques, the number of indoor three-dimensional (3D) point cloud models has increased significantly. However, these massive “blind” point clouds can hardly satisfy the demands of many location-based indoor applications and GIS analysis. Robust semantic segmentation of 3D point clouds remains a challenge. In this paper, a 2D–3D semantic transfer method based on a segmentation with layout estimation network (SLENet) is proposed for robust segmentation of image-based indoor 3D point clouds. First, SLENet is devised to simultaneously obtain semantic labels and an indoor spatial layout estimate from 2D images. A pixel labeling pool is then constructed to incorporate the visual graphical model and realize efficient 2D–3D semantic transfer for 3D point clouds, which avoids time-consuming pixel-wise label transfer and reprojection error. Finally, a 3D-contextual refinement, which explores extra-image consistency with 3D constraints, is developed to suppress the labeling contradictions caused by multi-superpixel aggregation. The experiments were conducted on an open dataset (the NYUDv2 indoor dataset) and a local dataset. Compared with state-of-the-art methods for 2D semantic segmentation, SLENet learns sufficiently discriminative features for inter-class segmentation while preserving clear boundaries for intra-class segmentation. Building on SLENet, the final 3D semantic segmentation, tested on the point cloud created from the local image dataset, reaches a total accuracy of 89.97%, with both object semantics and indoor structural information expressed.

  • Research Article
  • Citations: 30
  • 10.3390/electronics12132829
Semantic Segmentation of Transmission Corridor 3D Point Clouds Based on CA-PointNet++
  • Jun 26, 2023
  • Electronics
  • Guanjian Wang + 4 more

Automated extraction of key points from three-dimensional (3D) point clouds in transmission corridors provides technical support for digital twin construction and risk management of the power grid. However, accurately and efficiently segmenting the point clouds of transmission corridors remains a challenging problem. Traditional segmentation methods for transmission corridors suffer from low accuracy and poor generalization ability, and the potential of deep learning in this field has been overlooked. Therefore, the PointNet++ deep learning model is employed as the backbone network for the segmentation of 3D point clouds in transmission corridors. Additionally, given the distinct distribution of key components, an end-to-end CA-PointNet++ architecture is proposed by integrating the Coordinate Attention (CA) module with PointNet++. This approach captures long-distance spatial contextual features and improves feature saliency for more precise segmentation. Furthermore, CA-PointNet++ is evaluated on a dataset of 3D point clouds collected by unmanned aerial vehicles (UAV) equipped with Light Detection and Ranging (LiDAR) for inspecting transmission corridors. The results show that CA-PointNet++ achieved 93.7% overall accuracy (OA) and 67.4% mean Intersection over Union (mIoU). Comparative studies with established deep learning models confirm that our proposed CA-PointNet++ exhibits high accuracy and strong generalization ability for point cloud segmentation tasks in transmission corridors.
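
Overall accuracy and mean IoU, the two metrics reported in this abstract, can diverge considerably when classes are imbalanced. The following sketch computes both from per-point labels using their standard definitions; it is not the paper's evaluation code, and the function name is hypothetical.

```python
def overall_accuracy_and_miou(y_true, y_pred, num_classes):
    """Return (OA, mIoU) for integer class labels in [0, num_classes)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and equally long")
    # conf[t][p] counts points of true class t predicted as class p.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        conf[t][p] += 1
    total = sum(sum(row) for row in conf)
    oa = sum(conf[c][c] for c in range(num_classes)) / total
    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp
        fp = sum(conf[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return oa, sum(ious) / len(ious)
```

OA weights every point equally, so large classes dominate it, while mIoU averages over classes, so a poorly segmented rare class lowers it noticeably.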

  • Conference Article
  • 10.1109/ipec54454.2022.9777336
GFFS: Gravitational-Force Feature Real Time Segmentation of 3D Point Cloud
  • Apr 14, 2022
  • Chunhao Shi + 6 more

Point cloud segmentation is an important technology for autonomous vehicles. Because road environments are diverse and complex, instance segmentation of point clouds is a challenging task. To address this challenge, gravitational-force feature segmentation of 3D point clouds (GFFS) is proposed. First, it divides the point cloud into areas. Second, it sets the radius of the spherical neighborhood corresponding to each area based on the size and density of the point cloud. Third, it calculates the gravitational forces between the space of each spherical neighborhood and the barycenter of the area. GFFS performs instance segmentation of all kinds of objects in a point cloud scene by analyzing the magnitude and continuity of the gravitational forces. Finally, experiments verify the superiority and effectiveness of the algorithm.

  • Conference Article
  • Citations: 8
  • 10.1109/eusipco.2016.7760379
3D point cloud segmentation oriented to the analysis of interactions
  • Aug 1, 2016
  • Xiao Lin + 2 more

Given the widespread availability of point cloud data from consumer depth sensors, 3D point cloud segmentation becomes a promising building block for high level applications such as scene understanding and interaction analysis. It benefits from the richer information contained in real world 3D data compared to 2D images. This also implies that the classical color segmentation challenges have shifted to RGBD data, and new challenges have also emerged as the depth information is usually noisy, sparse and unorganized. Meanwhile, the lack of 3D point cloud ground truth labeling also limits the development and comparison among methods in 3D point cloud segmentation. In this paper, we present two contributions: a novel graph based point cloud segmentation method for RGBD stream data with interacting objects and a new ground truth labeling for a previously published data set [1]. This data set focuses on interaction (merge and split between ‘object’ point clouds), which differentiates itself from the few existing labeled RGBD data sets which are more oriented to Simultaneous Localization And Mapping (SLAM) tasks. The proposed point cloud segmentation method is evaluated with the 3D point cloud ground truth labeling. Experiments show the promising result of our approach.

  • Research Article
  • Citations: 1
  • 10.1109/tvcg.2024.3376951
Viewpoint Recommendation for Point Cloud Labeling Through Interaction Cost Modeling
  • Jan 1, 2024
  • IEEE Transactions on Visualization and Computer Graphics
  • Yu Zhang + 3 more

Semantic segmentation of 3D point clouds is important for many applications, such as autonomous driving. To train semantic segmentation models, labeled point cloud segmentation datasets are essential. However, point cloud labeling is time-consuming for annotators, as it typically involves tuning the camera viewpoint and selecting points by lasso. To reduce this time cost, we propose a viewpoint recommendation approach. We adapt Fitts' law to model the time cost of lasso selection in point clouds. Using the modeled time cost, the viewpoint that minimizes the lasso selection time cost is recommended to the annotator. We build a data labeling system for semantic segmentation of 3D point clouds that integrates our viewpoint recommendation approach. The system enables users to navigate to recommended viewpoints for efficient annotation. Through an ablation study, we observed that our approach effectively reduced the data labeling time cost. We also qualitatively compare our approach with previous viewpoint selection approaches on different datasets.
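
The abstract does not say how Fitts' law is adapted to lasso selection, so the sketch below only shows the classic Shannon formulation, MT = a + b * log2(D / W + 1), and a cost-minimizing choice among candidates. The function names, the constants a and b, and the per-viewpoint (distance, width) pairs are hypothetical.

```python
import math


def fitts_time(distance: float, width: float,
               a: float = 0.0, b: float = 1.0) -> float:
    """Predicted movement time under the Shannon form of Fitts' law.

    `a` and `b` are empirical constants; the defaults are placeholders.
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return a + b * math.log2(distance / width + 1.0)


def recommend_viewpoint(candidates):
    """Pick the candidate with the lowest predicted time cost.

    `candidates` maps a viewpoint name to a hypothetical
    (distance, width) pair describing the selection task in that view.
    """
    return min(candidates, key=lambda name: fitts_time(*candidates[name]))
```

For example, `recommend_viewpoint({"front": (3.0, 1.0), "top": (1.0, 1.0)})` selects `"top"`, since a shorter movement to a target of the same width has a lower predicted cost.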

  • Conference Article
  • Citations: 7
  • 10.24963/ijcai.2022/168
TopoSeg: Topology-aware Segmentation for Point Clouds
  • Jul 1, 2022
  • Weiquan Liu + 5 more

Point cloud segmentation plays an important role in AI applications such as autonomous driving, AR, and VR. However, previous point cloud segmentation neural networks rarely pay attention to the topological correctness of the segmentation results. This paper approaches the problem from the perspective of topology awareness. First, to optimize the distribution of segmented predictions from a topological perspective, we introduce persistent homology theory from topology into a 3D point cloud deep learning framework. Second, we propose a topology-aware 3D point cloud segmentation module, TopoSeg. Specifically, we design a topological loss function embedded in the TopoSeg module, which imposes topological constraints on the segmentation of 3D point clouds. Experiments show that the proposed TopoSeg module can be easily embedded into a point cloud segmentation network and improves the segmentation performance. In addition, based on the constructed topological loss function, we propose a topology-aware point cloud edge extraction algorithm, which is shown to be strongly robust.

  • Conference Article
  • Citations: 449
  • 10.1109/ivs.2010.5548059
Fast segmentation of 3D point clouds for ground vehicles
  • Jun 1, 2010
  • M Himmelsbach + 2 more

This paper describes a fast method for segmentation of large-size long-range 3D point clouds that especially lends itself for later classification of objects. Our approach is targeted at high-speed autonomous ground robot mobility, so real-time performance of the segmentation method plays a critical role. This is especially true as segmentation is considered only a necessary preliminary for the more important task of object classification that is itself computationally very demanding. Efficiency is achieved in our approach by splitting the segmentation problem into two simpler subproblems of lower complexity: local ground plane estimation followed by fast 2D connected components labeling. The method's performance is evaluated on real data acquired in different outdoor scenes, and the results are compared to those of existing methods. We show that our method requires less runtime while at the same time yielding segmentation results that are better suited for later classification of the identified objects.
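
The second subproblem named in this abstract, 2D connected components labeling, can be sketched on an occupancy grid as follows. This is a generic breadth-first labeling, not the authors' algorithm; it assumes ground cells have already been removed, that the grid is a nested list of 0/1 values, and that cells are 4-connected.

```python
from collections import deque


def label_components(grid):
    """Label 4-connected groups of occupied cells in a 2D grid.

    Returns (labels, count), where labels[r][c] is 0 for empty cells
    and a component id in 1..count for occupied cells.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or labels[r][c]:
                continue
            # Unlabeled occupied cell: start a new component and flood it.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] and not labels[nr][nc]):
                        labels[nr][nc] = count
                        queue.append((nr, nc))
    return labels, count
```

Each cell is visited a constant number of times, so the labeling runs in time linear in the grid size, which is consistent with the real-time emphasis of the abstract.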

  • Research Article
  • Citations: 3
  • 10.1007/s11063-020-10368-8
DCARN: Deep Context Aware Recurrent Neural Network for Semantic Segmentation of Large Scale Unstructured 3D Point Cloud
  • Oct 17, 2020
  • Neural Processing Letters
  • Saba Mehmood + 2 more

Semantic segmentation of large unstructured 3D point clouds is an important problem for 3D object recognition, which in turn is essential to solving more complex tasks such as scene understanding. The problem is highly challenging owing to the large scale of the data, varying point density and localization errors of 3D points. Nevertheless, following the recent success of deep neural network architectures on complex 2D perceptual problems, several researchers have sought to transfer 2D networks to 3D point cloud segmentation by adding a prior voxelization step that provides an explicit neighborhood representation. However, such a 3D grid representation loses fine details and inherent structure due to quantization artifacts. To address this, this paper proposes an approach to semantic segmentation of 3D point clouds that exploits super-point based graph construction. The proposed architecture is composed of two cascaded modules: a lightweight representation learning module, which uses unsupervised geometric grouping to partition the large-scale unstructured 3D point cloud, and a deep context-aware sequential network based on long short-term memory units and graph convolutions with embedding residual learning for semantic segmentation. The proposed model is evaluated on two standard benchmark datasets and achieves performance competitive with existing state-of-the-art methods. The code and the obtained results have been made public at https://github.com/saba155/DCARN .

  • Book Chapter
  • Citations: 63
  • 10.1007/978-3-642-28572-1_40
A Pipeline for the Segmentation and Classification of 3D Point Clouds
  • Jan 1, 2014
  • B Douillard + 4 more

This paper presents algorithms for fast segmentation of 3D point clouds and subsequent classification of the obtained 3D segments. The method jointly determines the ground surface and segments individual objects in 3D, including overhanging structures. When compared to six other terrain modelling techniques, this approach has minimal error between the sensed data and the representation; and is fast (processing a Velodyne scan in approximately 2 seconds). Applications include improved alignment of successive scans by enabling operations in sections (Velodyne scans are aligned 7% sharper compared to an approach using raw points) and more informed decision-making (paths move around overhangs). The use of segmentation to aid classification through 3D features, such as the Spin Image or the Spherical Harmonic Descriptor, is discussed and experimentally compared. Moreover, the segmentation facilitates a novel approach to 3D classification that bypasses feature extraction and directly compares 3D shapes via the ICP algorithm. This technique is shown to achieve accuracy on par with the best feature based classifier (92.1%) while being significantly faster and allowing a clearer understanding of the classifier’s behaviour.

  • Research Article
  • Citations: 1
  • 10.1109/tvcg.2024.3484654
Visual Boundary-Guided Pseudo-Labeling for Weakly Supervised 3D Point Cloud Segmentation in Indoor Environments
  • Sep 1, 2025
  • IEEE Transactions on Visualization and Computer Graphics
  • Zhuo Su + 4 more

Accurate segmentation of 3D point clouds in indoor scenes remains a challenging task, often hindered by the labor-intensive nature of data annotation. While weakly supervised learning approaches have shown promise in leveraging partial annotations, they frequently struggle with imbalanced performance between foreground and background elements due to the complex structures and proximity of objects in indoor environments. To address this issue, we propose a novel foreground-aware label enhancement method utilizing visual boundary priors. Our approach projects 3D point clouds onto 2D planes and applies 2D image segmentation to generate pseudo-labels for foreground objects. These labels are subsequently back-projected into 3D space and used to train an initial segmentation model. We further refine this process by incorporating prior knowledge from projected images to filter the predicted labels, followed by model retraining. We introduce this technique as the Foreground Boundary Prior (FBP), a versatile, plug-and-play module designed to enhance various weakly supervised point cloud segmentation methods. We demonstrate the efficacy of our approach on the widely-used 2D-3D-Semantic dataset, employing both random-sample and bounding-box based weak labeling strategies. Our experimental results show significant improvements in segmentation performance across different architectural backbones, highlighting the method's effectiveness and portability.

  • Conference Article
  • Citations: 13
  • 10.1109/apmar.2019.8709156
Semantic Segmentation of 3D Point Cloud to Virtually Manipulate Real Living Space
  • Mar 1, 2019
  • Yuki Ishikawa + 5 more

This paper presents a method for the virtual manipulation of real living space using semantic segmentation of a 3D point cloud captured in the real world. We applied PointNet to segment each piece of furniture from the point cloud of a real indoor environment captured by moving an RGB-D camera. For semantic segmentation, we focused on local geometric information not used in PointNet, and we proposed a method to refine the class probability of the labels attached to each point in PointNet’s output. The effectiveness of our method was experimentally confirmed. We then created 3D models of real-world furniture using a point cloud with corrected labels, and we virtually manipulated real living space using Dollhouse VR, a layout system.

  • Research Article
  • Citations: 11
  • 10.1016/j.patcog.2024.111014
VPA-Net: A visual perception assistance network for 3d lidar semantic segmentation
  • Sep 20, 2024
  • Pattern Recognition
  • Fangfang Lin + 5 more
