RESSCAL3D++: Joint Acquisition and Semantic Segmentation of 3D Point Clouds

  • Abstract
  • Literature Map
  • Similar Papers
Abstract

3D scene understanding is crucial for facilitating seamless interaction between digital devices and the physical world. Real-time capturing and processing of the 3D scene are essential for achieving this seamless integration. While existing approaches typically separate acquisition and processing for each frame, the advent of resolution-scalable 3D sensors offers an opportunity to overcome this paradigm and fully leverage the otherwise wasted acquisition time to initiate processing. In this study, we introduce VX-S3DIS, a novel point cloud dataset accurately simulating the behavior of a resolution-scalable 3D sensor. Additionally, we present RESSCAL3D++, an important improvement over our prior work, RESSCAL3D, incorporating an update module and processing strategy. By applying our method to the new dataset, we practically demonstrate the potential of joint acquisition and semantic segmentation of 3D point clouds. Our resolution-scalable approach significantly reduces the scalability cost from 2% to just 0.2% in mIoU while achieving speed-ups of 15.6% to 63.9% compared to the non-scalable baseline. Furthermore, our scalable approach enables early predictions, with the first one occurring after only 7% of the total inference time of the baseline. The new VX-S3DIS dataset is available at https://github.com/remcoroyen/vx-s3dis.
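The core idea of joint acquisition and processing can be illustrated with a short sketch: partial predictions are produced as soon as each coarse subset of points arrives, instead of waiting for the full scan. The sensor model, the `segment` placeholder, and all sizes below are illustrative assumptions, not the RESSCAL3D++ implementation:

```python
import numpy as np

def coarse_to_fine_stream(points, levels=4, seed=0):
    """Yield successively larger subsets of a cloud, mimicking a
    resolution-scalable sensor that delivers a coarse scan first and
    keeps refining it while earlier data is already being processed."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(points))
    for k in range(1, levels + 1):
        yield points[order[: len(points) * k // levels]]

def segment(points):
    # Placeholder "model": label each point by the sign of its z coordinate.
    return (points[:, 2] > 0).astype(int)

# Joint acquisition and processing: every partial scan yields an early
# prediction; the last one uses the full-resolution cloud.
cloud = np.random.default_rng(1).normal(size=(1000, 3))
predictions = [segment(chunk) for chunk in coarse_to_fine_stream(cloud)]
print([len(p) for p in predictions])  # → [250, 500, 750, 1000]
```

In the real system an update module would refine earlier predictions rather than re-segment each subset from scratch; the sketch only shows where the early predictions come from.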

Similar Papers
  • Research Article
  • Citations: 2
  • 10.5194/isprs-archives-xlii-2-w13-785-2019
SEMANTIC SEGMENTATION OF INDOOR 3D POINT CLOUD WITH SLENET
  • Jun 5, 2019
  • The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
  • Y Ding + 3 more

Abstract. With the rapid development of new indoor sensors and acquisition techniques, the number of indoor three-dimensional (3D) point cloud models has increased significantly. However, these massive “blind” point clouds struggle to meet the demands of many location-based indoor applications and GIS analyses. Robust semantic segmentation of 3D point clouds remains a challenge. In this paper, a segmentation with layout estimation network (SLENet)-based 2D–3D semantic transfer method is proposed for robust segmentation of image-based indoor 3D point clouds. Firstly, a SLENet is devised to simultaneously obtain semantic labels and an indoor spatial layout estimate from 2D images. A pixel labeling pool is then constructed, incorporating a visual graphical model to realize efficient 2D–3D semantic transfer for 3D point clouds, which avoids time-consuming pixel-wise label transfer and reprojection error. Finally, a 3D contextual refinement, which exploits extra-image consistency under 3D constraints, is developed to suppress labeling contradictions caused by multi-superpixel aggregation. The experiments were conducted on an open dataset (the NYUDv2 indoor dataset) and a local dataset. In comparison with state-of-the-art 2D semantic segmentation methods, SLENet learns features discriminative enough for inter-class segmentation while preserving clear boundaries for intra-class segmentation. Building on SLENet's strength, the final 3D semantic segmentation, tested on the point cloud created from the local image dataset, reaches a total accuracy of 89.97%, expressing both object semantics and indoor structural information.

  • Research Article
  • Citations: 24
  • 10.3390/rs14143415
A Prior Level Fusion Approach for the Semantic Segmentation of 3D Point Clouds Using Deep Learning
  • Jul 16, 2022
  • Remote Sensing
  • Zouhair Ballouch + 4 more

Three-dimensional digital models play a pivotal role in city planning, monitoring, and sustainable management of smart and Digital Twin Cities (DTCs). In this context, semantic segmentation of airborne 3D point clouds is crucial for modeling, simulating, and understanding large-scale urban environments. Previous research has demonstrated that the performance of 3D semantic segmentation can be improved by fusing 3D point clouds with other data sources. In this paper, a new prior-level fusion approach is proposed for semantic segmentation of large-scale urban areas using optical images and point clouds. The proposed approach uses an image classification obtained by the Maximum Likelihood Classifier as the prior knowledge for 3D semantic segmentation. Afterwards, the raster values from the classified images are assigned to the Lidar point cloud during data preparation. Finally, an advanced Deep Learning model (RandLaNet) is adopted to perform the 3D semantic segmentation. The results show that the proposed approach performs well in terms of both evaluation metrics and visual examination, achieving a higher Intersection over Union (96%) on the created dataset compared with 92% for the non-fusion approach.
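The prior-level fusion step described above, assigning classified raster values to Lidar points during data preparation, can be sketched roughly as follows. The georeferencing model, class ids, and function names are illustrative assumptions, not the paper's pipeline:

```python
import numpy as np

def attach_prior_labels(points, class_raster, origin, cell):
    """Look up each Lidar point's (x, y) in a classified raster and append
    the class id as an extra per-point feature channel. Georeferencing is
    simplified to an axis-aligned grid with a fixed cell size."""
    col = ((points[:, 0] - origin[0]) / cell).astype(int)
    row = ((points[:, 1] - origin[1]) / cell).astype(int)
    h, w = class_raster.shape
    row = np.clip(row, 0, h - 1)   # clamp points falling outside the raster
    col = np.clip(col, 0, w - 1)
    prior = class_raster[row, col]
    return np.c_[points, prior]    # columns: x, y, z, prior_class

# Toy raster: left half class 0 ("ground"), right half class 1 ("building").
raster = np.zeros((10, 10), dtype=int)
raster[:, 5:] = 1
pts = np.array([[1.0, 1.0, 0.1], [8.0, 2.0, 5.0]])
fused = attach_prior_labels(pts, raster, origin=(0.0, 0.0), cell=1.0)
```

The fused array would then be fed to the 3D segmentation network as an additional input feature per point.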

  • Conference Article
  • 10.1109/dasc-picom-cbdcom-cyberscitech49142.2020.00034
Segmentation of 3D Point Clouds for Weak Texture Ground Plane
  • Aug 1, 2020
  • Ming-Can Geng + 3 more

Ground-plane segmentation of 3D point clouds can generate the drivable area for robots' autonomous navigation. Compared with lasers for generating 3D point clouds, cameras provide more information and higher scalability. However, when a camera is used to generate 3D point clouds of the ground, the ground often lacks texture and is therefore frequently missing from the point clouds. A method for segmenting 3D point clouds of weakly textured ground planes is proposed in this paper. Firstly, a point cloud preprocessing step is designed using downsampling methods. Secondly, a Euclidean clustering algorithm is used to segment the point clouds. Thirdly, vertical projection and plane fitting based on the RANSAC algorithm are proposed. Fourthly, the feasible region, i.e., the area through which the robot can safely pass, is segmented. Finally, the proposed method is verified experimentally.
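The RANSAC plane fitting step at the heart of such pipelines can be illustrated with a minimal sketch; the thresholds, iteration count, and synthetic data below are assumptions for demonstration, not the paper's configuration:

```python
import numpy as np

def ransac_plane(points, n_iters=200, thresh=0.05, seed=0):
    """Fit a dominant plane with a minimal RANSAC loop: repeatedly sample
    3 points, build the plane through them, and keep the hypothesis with
    the most inliers within `thresh` of the plane."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:          # degenerate (collinear) sample
            continue
        normal /= norm
        dist = np.abs((points - p0) @ normal)
        inliers = dist < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

# Synthetic scene: a flat ground patch plus an off-plane obstacle.
rng = np.random.default_rng(1)
ground = np.c_[rng.uniform(-5, 5, (400, 2)), rng.normal(0, 0.01, 400)]
obstacle = rng.normal([0.0, 0.0, 1.0], 0.1, (50, 3))
scene = np.vstack([ground, obstacle])
mask = ransac_plane(scene)   # True for points on the fitted ground plane
```

Points outside the mask would then be passed to the clustering and feasible-region stages.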

  • Research Article
  • Citations: 3
  • 10.1007/s11063-020-10368-8
DCARN: Deep Context Aware Recurrent Neural Network for Semantic Segmentation of Large Scale Unstructured 3D Point Cloud
  • Oct 17, 2020
  • Neural Processing Letters
  • Saba Mehmood + 2 more

Semantic segmentation of large unstructured 3D point clouds is an important problem for 3D object recognition, which in turn is essential to solving more complex tasks such as scene understanding. The problem is highly challenging owing to the large scale of the data, varying point density, and localization errors of 3D points. Nevertheless, with the recent success of deep neural network architectures on complex 2D perceptual problems, several researchers have sought to translate the developed 2D networks to 3D point cloud segmentation via a prior voxelization step that provides an explicit neighborhood representation. However, such a 3D grid representation loses fine details and inherent structure due to quantization artifacts. This paper therefore proposes an approach to semantic segmentation of 3D point clouds that exploits super-point-based graph construction. The proposed architecture is composed of two cascaded modules: a light-weight representation learning module that uses unsupervised geometric grouping to partition the large-scale unstructured 3D point cloud, and a deep context-aware sequential network based on long short-term memory units and graph convolutions with embedded residual learning for semantic segmentation. The proposed model is evaluated on two standard benchmark datasets and achieves competitive performance with existing state-of-the-art methods. The code and the obtained results are public at https://github.com/saba155/DCARN.

  • Conference Article
  • Citations: 447
  • 10.1109/ivs.2010.5548059
Fast segmentation of 3D point clouds for ground vehicles
  • Jun 1, 2010
  • M Himmelsbach + 2 more

This paper describes a fast method for segmentation of large-size long-range 3D point clouds that especially lends itself for later classification of objects. Our approach is targeted at high-speed autonomous ground robot mobility, so real-time performance of the segmentation method plays a critical role. This is especially true as segmentation is considered only a necessary preliminary for the more important task of object classification that is itself computationally very demanding. Efficiency is achieved in our approach by splitting the segmentation problem into two simpler subproblems of lower complexity: local ground plane estimation followed by fast 2D connected components labeling. The method's performance is evaluated on real data acquired in different outdoor scenes, and the results are compared to those of existing methods. We show that our method requires less runtime while at the same time yielding segmentation results that are better suited for later classification of the identified objects.
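The two-stage split described above, local ground removal followed by fast 2D connected-components labeling, can be sketched as follows. This is a toy stand-in using a flat height threshold and an occupancy grid; the cell size, threshold, and SciPy's generic labeling are assumptions, not the authors' optimized implementation:

```python
import numpy as np
from scipy import ndimage

def label_obstacles(points, cell=0.5, ground_z=0.2):
    """Drop near-ground points, rasterize the rest into a 2D occupancy
    grid, and run connected-components labeling on the grid."""
    above = points[points[:, 2] > ground_z]      # crude "ground removal"
    ij = np.floor(above[:, :2] / cell).astype(int)
    ij -= ij.min(axis=0)                         # shift indices to >= 0
    grid = np.zeros(ij.max(axis=0) + 1, dtype=bool)
    grid[ij[:, 0], ij[:, 1]] = True
    # 8-connectivity: diagonally adjacent cells join the same segment.
    return ndimage.label(grid, structure=np.ones((3, 3)))

# Two well-separated clusters above a flat, slightly noisy ground.
rng = np.random.default_rng(0)
a = rng.normal([0.0, 0.0, 1.0], 0.2, (100, 3))
b = rng.normal([10.0, 10.0, 1.0], 0.2, (100, 3))
ground = np.c_[rng.uniform(0, 10, (200, 2)), rng.normal(0, 0.03, 200)]
labels, n = label_obstacles(np.vstack([a, b, ground]))
```

Working on the 2D grid instead of the raw 3D points is what makes this kind of segmentation fast enough for real-time robot mobility.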

  • Book Chapter
  • Citations: 63
  • 10.1007/978-3-642-28572-1_40
A Pipeline for the Segmentation and Classification of 3D Point Clouds
  • Jan 1, 2014
  • B Douillard + 4 more

This paper presents algorithms for fast segmentation of 3D point clouds and subsequent classification of the obtained 3D segments. The method jointly determines the ground surface and segments individual objects in 3D, including overhanging structures. When compared to six other terrain modelling techniques, this approach has minimal error between the sensed data and the representation; and is fast (processing a Velodyne scan in approximately 2 seconds). Applications include improved alignment of successive scans by enabling operations in sections (Velodyne scans are aligned 7% sharper compared to an approach using raw points) and more informed decision-making (paths move around overhangs). The use of segmentation to aid classification through 3D features, such as the Spin Image or the Spherical Harmonic Descriptor, is discussed and experimentally compared. Moreover, the segmentation facilitates a novel approach to 3D classification that bypasses feature extraction and directly compares 3D shapes via the ICP algorithm. This technique is shown to achieve accuracy on par with the best feature based classifier (92.1%) while being significantly faster and allowing a clearer understanding of the classifier’s behaviour.

  • Conference Article
  • Citations: 1
  • 10.23919/ccc52363.2021.9550720
Associate Semantic-Instance Segmentation of 3D Point Clouds Based on Local Feature Extraction
  • Jul 26, 2021
  • Hui Chen + 1 more

Segmentation of 3D point clouds, which can express the information of complex scenes more accurately, is an important basis for 3D scene understanding. However, how to effectively use 3D point cloud information for complex scenes is rarely discussed. This work proposes a two-stage network that achieves both semantic and instance segmentation of point clouds. Specifically, a simple multitask network is first developed by extracting multi-category features of the local point cloud, which by itself already achieves strong segmentation results. Then, a learnable network is established to make semantic segmentation and instance segmentation mutually promote each other, so as to segment complex scenes more accurately. The validity of this network is demonstrated by experiments and evaluation on the S3DIS dataset. Compared with other well-known networks, the proposed two-stage network shows its superiority.

  • Research Article
  • Citations: 1
  • 10.1109/tvcg.2024.3376951
Viewpoint Recommendation for Point Cloud Labeling Through Interaction Cost Modeling.
  • Jan 1, 2024
  • IEEE transactions on visualization and computer graphics
  • Yu Zhang + 3 more

Semantic segmentation of 3D point clouds is important for many applications, such as autonomous driving. To train semantic segmentation models, labeled point cloud segmentation datasets are essential. Meanwhile, point cloud labeling is time-consuming for annotators, as it typically involves tuning the camera viewpoint and selecting points by lasso. To reduce annotators' labeling time cost, we propose a viewpoint recommendation approach. We adapt Fitts' law to model the time cost of lasso selection in point clouds. Using the modeled time cost, the viewpoint that minimizes the lasso selection cost is recommended to the annotator. We build a data labeling system for semantic segmentation of 3D point clouds that integrates our viewpoint recommendation approach. The system enables users to navigate to recommended viewpoints for efficient annotation. Through an ablation study, we observed that our approach effectively reduces the data labeling time cost. We also qualitatively compare our approach with previous viewpoint selection approaches on different datasets.
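Fitts' law models selection time as T = a + b * log2(D / W + 1), where D is the movement distance and W the target width. A hedged sketch of how such a cost model could drive viewpoint ranking follows; the coefficients and candidate views are made up for illustration and are not the paper's fitted parameters:

```python
import math

def lasso_time(distance, width, a=0.2, b=0.15):
    """Fitts'-law-style cost: T = a + b * log2(D / W + 1).
    The coefficients a and b here are illustrative; the paper would fit
    its own parameters to point-cloud lasso selection data."""
    return a + b * math.log2(distance / width + 1)

def best_viewpoint(candidates):
    """Pick the viewpoint whose projected selection has the lowest modeled
    cost; each candidate is (name, projected_distance, projected_width)."""
    return min(candidates, key=lambda c: lasso_time(c[1], c[2]))

# Hypothetical candidate views with the target's projected extent in each.
views = [("front", 12.0, 1.0), ("top", 6.0, 2.0), ("side", 9.0, 0.5)]
print(best_viewpoint(views)[0])  # prints "top"
```

The intuition matches the model: the "top" view makes the target appear both closer and wider, so its index of difficulty, and hence its predicted lasso time, is smallest.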

  • Conference Article
  • Citations: 13
  • 10.1109/apmar.2019.8709156
Semantic Segmentation of 3D Point Cloud to Virtually Manipulate Real Living Space
  • Mar 1, 2019
  • Yuki Ishikawa + 5 more

This paper presents a method for the virtual manipulation of real living space using semantic segmentation of a 3D point cloud captured in the real world. We applied PointNet to segment each piece of furniture from the point cloud of a real indoor environment captured by moving a RGB-D camera. For semantic segmentation, we focused on local geometric information not used in PointNet, and we proposed a method to refine the class probability of labels attached to each point in PointNet’s output. The effectiveness of our method was experimentally confirmed. We then created 3D models of real-world furniture using a point cloud with corrected labels, and we virtually manipulated real living space using Dollhouse VR, a layout system.

  • Conference Article
  • 10.1109/ipec54454.2022.9777336
GFFS: Gravitational-Force Feature Real Time Segmentation of 3D Point Cloud
  • Apr 14, 2022
  • Chunhao Shi + 6 more

Segmentation of point clouds is a momentous technology for autonomous vehicles. Due to the diversity and complexity of road environments, instance segmentation based on point clouds is a challenging task. Addressing this challenge, a gravitational-force feature segmentation of 3D point clouds (GFFS) is proposed. Firstly, it divides the point cloud into areas; secondly, it sets the radius of the spherical neighborhood for each area based on the size and density of the point cloud; thirdly, it calculates the gravitational forces between each spherical neighborhood and the barycenter of its area. GFFS performs instance segmentation of all kinds of objects in a point cloud scene by analyzing the magnitude and continuity of these gravitational forces. Finally, the superiority and effectiveness of the algorithm are verified by experiments.
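A rough, illustrative reading of the gravitational-force idea follows, treating point counts as masses in a Newtonian-style score. The constants, radii, and the exact scoring formula are assumptions, not the paper's definition:

```python
import numpy as np

def gravity_features(points, centers, radius=1.0):
    """For each spherical neighborhood center, compute a gravity-like
    score m_local * m_total / r^2 toward the cloud barycenter, where the
    "masses" are point counts and r is the center-to-barycenter distance."""
    barycenter = points.mean(axis=0)
    feats = []
    for c in centers:
        d = np.linalg.norm(points - c, axis=1)
        m_local = (d < radius).sum()          # "mass" of the neighborhood
        r = np.linalg.norm(c - barycenter)
        feats.append(m_local * len(points) / max(r, 1e-6) ** 2)
    return np.array(feats)

# A dense cloud around the origin: a neighborhood near the barycenter
# should score far higher than one placed in empty space.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(500, 3))
centers = np.array([[0.0, 0.0, 0.0], [3.0, 3.0, 3.0]])
scores = gravity_features(cloud, centers)
```

Instance boundaries would then show up as discontinuities in such scores across neighboring regions, which is the property the paper's analysis of magnitude and continuity exploits.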

  • Conference Article
  • Citations: 41
  • 10.1145/3394171.3413829
CF-SIS: Semantic-Instance Segmentation of 3D Point Clouds by Context Fusion with Self-Attention
  • Oct 12, 2020
  • Xin Wen + 3 more

3D Semantic-Instance Segmentation (SIS) is a newly emerging research direction that aims to understand the visual information of a 3D scene on both the semantic and the instance level. The main difficulty lies in coordinating the paradox between mutual aid and the sub-optimality problem. Previous methods usually address the mutual aid between instances and semantics by direct feature fusion or hand-crafted constraints that share the common knowledge of the two tasks. However, they neglect the abundant common knowledge of feature context in the feature space. Moreover, direct feature fusion can raise the sub-optimality problem, since a false instance prediction can interfere with the semantic segmentation prediction and vice versa. To address these two issues, we propose a novel feature context fusion network for the SIS task, named CF-SIS. The idea is to associatively learn semantic and instance segmentation of 3D point clouds by context fusion with attention in the feature space. Our main contributions are two context fusion modules. First, we propose a novel inter-task context fusion module to take full advantage of mutual aid and relieve the sub-optimality problem. It extracts the context in feature space from one task with attention, and selectively fuses the context into the other task using a gated fusion mechanism. Then, to enhance the mutual-aid effect, an intra-task context fusion module is designed to further integrate the fused context by selectively merging similar features through a self-attention mechanism. We conduct experiments on the S3DIS and ShapeNet datasets and show that CF-SIS outperforms state-of-the-art methods on the semantic and instance segmentation tasks.

  • Research Article
  • Citations: 7
  • 10.1016/j.isprsjprs.2023.05.018
Semantic segmentation of mobile mapping point clouds via multi-view label transfer
  • Jun 8, 2023
  • ISPRS Journal of Photogrammetry and Remote Sensing
  • Torben Peters + 2 more

We study how to learn semantic segmentation of 3D point clouds from small training sets. The problem arises because annotating 3D point clouds is a lot more time-consuming and error-prone than annotating 2D images. On the one hand this means that one cannot afford to create a large enough training dataset for each new project. On the other hand it also means that there is not nearly as much public data available as there is for images, which one could use to pretrain a generic feature extractor that could then, with only little dedicated training data, be adapted (“fine-tuned”) to the task at hand. To address this bottleneck we explore the possibility to transfer knowledge from the 2D image domain to 3D point clouds. That strategy is of particular interest for mobile mapping systems that capture both point clouds and images, in a fully calibrated setting that makes it easy to connect the two domains. We find that, as expected, naively segmenting in image space and mapping the resulting labels onto the point cloud is not sufficient, as visual ambiguities, residual calibration errors, etc. affect the result. Instead, we propose a system that learns to merge image evidence from a varying number of viewpoints, together with 3D geometry information, into a common representation that encodes point-wise 3D semantics. To validate our approach we make use of a new mobile mapping dataset with 88M annotated 3D points and 2205 oriented multi-view images. In a series of experiments, we show how much label noise is caused by simplistic label transfer, and how well existing semantic segmentation architectures can correct it. Finally, we demonstrate that adding our learned 2D-to-3D multi-view label transfer significantly improves the performance of different segmentation backbones.
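The "simplistic label transfer" baseline the authors measure against, projecting per-view 2D labels onto points and voting, can be sketched as below. The voting scheme and the -1 visibility convention are illustrative assumptions, not the paper's formulation:

```python
import numpy as np

def transfer_labels(point_labels_per_view, n_classes=3):
    """Naive multi-view label transfer: each calibrated view votes a class
    per 3D point; take the majority. A vote of -1 marks a point that is
    not visible in that view and is ignored."""
    votes = np.asarray(point_labels_per_view)   # shape (n_views, n_points)
    out = np.full(votes.shape[1], -1)
    for i in range(votes.shape[1]):
        v = votes[:, i]
        v = v[v >= 0]                           # drop "not visible" votes
        if len(v):
            out[i] = np.bincount(v, minlength=n_classes).argmax()
    return out

# 3 views voting on 4 points; disagreements are resolved by majority.
views = [[0, 1, -1, 2],
         [0, 1, 1, 2],
         [0, 2, 1, -1]]
labels = transfer_labels(views)
```

Exactly this kind of hard vote is what suffers from visual ambiguities and calibration error; the paper replaces it with a learned merge of image evidence and 3D geometry.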

  • Conference Article
  • Citations: 9
  • 10.1109/3dv57658.2022.00025
Push-the-Boundary: Boundary-aware Feature Propagation for Semantic Segmentation of 3D Point Clouds
  • Sep 1, 2022
  • Shenglan Du + 4 more

Feedforward fully convolutional neural networks currently dominate in semantic segmentation of 3D point clouds. Despite their great success, they suffer from the loss of local information at low-level layers, posing significant challenges to accurate scene segmentation and precise object boundary delineation. Prior works either address this issue by post-processing or jointly learn object boundaries to implicitly improve feature encoding of the networks. These approaches often require additional modules which are difficult to integrate into the original architecture. To improve the segmentation near object boundaries, we propose a boundary-aware feature propagation mechanism. This mechanism is achieved by exploiting a multitask learning framework that aims to explicitly guide the boundaries to their original locations. With one shared encoder, our network outputs (i) boundary localization, (ii) prediction of directions pointing to the object's interior, and (iii) semantic segmentation, in three parallel streams. The predicted boundaries and directions are fused to propagate the learned features to refine the segmentation. We conduct extensive experiments on the S3DIS and SensatUrban datasets against various baseline methods, demonstrating that our proposed approach yields consistent improvements by reducing boundary errors. Our code is available at https://github.com/shenglandu/PushBoundary.

  • Research Article
  • Citations: 10
  • 10.3390/rs13183621
Exploiting Structured CNNs for Semantic Segmentation of Unstructured Point Clouds from LiDAR Sensor
  • Sep 10, 2021
  • Remote Sensing
  • Muhammad Ibrahim + 3 more

Accurate semantic segmentation of 3D point clouds is a long-standing problem in remote sensing and computer vision. Due to the unstructured nature of point clouds, designing deep neural architectures for point cloud semantic segmentation is often not straightforward. In this work, we circumvent this problem by devising a technique to exploit structured neural architectures for unstructured data. In particular, we employ the popular convolutional neural network (CNN) architectures to perform semantic segmentation of LiDAR data. We propose a projection-based scheme that performs an angle-wise slicing of large 3D point clouds and transforms those slices into 2D grids. Accounting for intensity and reflectivity of the LiDAR input, the 2D grid allows us to construct a pseudo image for the point cloud slice. We enhance this image with low-level image processing techniques of normalization, histogram equalization, and decorrelation stretch to suit our ultimate object of semantic segmentation. A large number of images thus generated are used to induce an encoder-decoder CNN model that learns to compute a segmented 2D projection of the scene, which we finally back project to the 3D point cloud. In addition to a novel method, this article also makes a second major contribution of introducing the enhanced version of our large-scale public PC-Urban outdoor dataset which is captured in a civic setup with an Ouster LiDAR sensor. The updated dataset (PC-Urban_V2) provides nearly 8 billion points including over 100 million points labeled for 25 classes of interest. We provide a thorough evaluation of our technique on PC-Urban_V2 and three other public datasets.
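The angle-wise slicing and 2D rasterization step could look roughly like this minimal sketch. The grid sizes and the occupancy-only encoding are simplifications (the paper also encodes intensity and reflectivity and applies image enhancement):

```python
import numpy as np

def slice_to_grid(points, n_slices=8, h=32, w=32):
    """Bin points by azimuth angle into slices, then rasterize each slice
    into a 2D grid indexed by normalized range (columns) and height
    (rows), producing one pseudo image per slice."""
    az = np.arctan2(points[:, 1], points[:, 0])            # azimuth, [-pi, pi]
    slice_id = ((az + np.pi) / (2 * np.pi) * n_slices).astype(int) % n_slices
    images = []
    for s in range(n_slices):
        sl = points[slice_id == s]
        img = np.zeros((h, w), dtype=np.float32)
        if len(sl):
            r = np.linalg.norm(sl[:, :2], axis=1)          # horizontal range
            u = np.clip((r / (r.max() + 1e-6) * (w - 1)).astype(int), 0, w - 1)
            v = np.clip(((sl[:, 2] - sl[:, 2].min()) /
                         (np.ptp(sl[:, 2]) + 1e-6) * (h - 1)).astype(int),
                        0, h - 1)
            img[v, u] = 1.0                                # occupancy only
        images.append(img)
    return np.stack(images)

cloud = np.random.default_rng(0).normal(size=(2000, 3))
imgs = slice_to_grid(cloud)   # (n_slices, h, w) stack of pseudo images
```

Each pseudo image would be segmented by a 2D encoder-decoder CNN, and the per-pixel labels back-projected onto the points of its slice.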

  • Research Article
  • Citations: 11
  • 10.1049/cvi2.12250
A survey on weakly supervised 3D point cloud semantic segmentation
  • Nov 2, 2023
  • IET Computer Vision
  • Jingyi Wang + 3 more

With the popularity and advancement of 3D point cloud data acquisition technologies and sensors, research into 3D point clouds has made considerable strides based on deep learning. The semantic segmentation of point clouds, a crucial step in comprehending 3D scenes, has drawn much attention. The accuracy and effectiveness of fully supervised semantic segmentation tasks have greatly improved with the increase in the number of accessible datasets. However, these achievements rely on time‐consuming and expensive full labelling. To address these issues, research on weakly supervised learning has recently surged. Such methods train neural networks to tackle 3D semantic segmentation tasks with fewer point labels. This survey provides a thorough overview of the history and current state of the art in weakly supervised semantic segmentation of 3D point clouds, along with a detailed description of the most widely used data acquisition sensors, a list of publicly accessible benchmark datasets, and a look ahead at potential future research directions.
