Point Cloud Geometry Scalable Coding with a Quality-Conditioned Latents Probability Estimator

Abstract

The widespread use of point clouds (PCs) for immersive visual applications has resulted in very heterogeneous receiving conditions and devices, notably in terms of network, hardware, and display capabilities. In this scenario, quality scalability, i.e., the ability to reconstruct a signal at different qualities by progressively decoding a single bitstream, is a major requirement that has yet to be conveniently addressed, notably in most learning-based PC coding solutions. This paper proposes a quality scalability scheme, named Scalable Quality Hyperprior (SQH), adaptable to learning-based static point cloud geometry codecs, which uses a Quality-conditioned Latents Probability Estimator (QuLPE) to decode a high-quality version of a PC learning-based representation, based on an available lower-quality base layer. SQH is integrated in the future JPEG PC coding standard, allowing the creation of a layered bitstream that can be used to progressively decode the PC geometry with increasing quality and fidelity. Experimental results show that SQH offers the quality scalability feature with very limited or no compression performance penalty when compared with the corresponding non-scalable solution, thus preserving the significant compression gains over other state-of-the-art PC codecs.

Similar Papers
  • Conference Article
  • Cited by 21
  • 10.1109/icip40778.2020.9191021
Point Cloud Geometry Scalable Coding With a Single End-to-End Deep Learning Model
  • Oct 1, 2020
  • Andre F R Guarda + 2 more

Point clouds are gaining importance as the format to represent complex 3D objects and scenes, offering high user immersion and interaction, although at the cost of requiring massive data. Scalable coding is an important feature for point cloud coding, especially for real-time applications, where fast and bitrate-efficient access to a decoded point cloud is important; however, this issue is still rather unexplored in the literature. With the rise of deep learning methods as a promising solution for efficient coding, this paper proposes the first deep learning-based point cloud geometry scalable coding solution. Experimental results show that the proposed scalable coding solution consistently outperforms the MPEG standard for static point cloud geometry coding. In this way, a new research path is opened for point cloud scalable coding technology.

  • Conference Article
  • Cited by 8
  • 10.1109/icip42928.2021.9506448
Cylindrical Coordinates for Lidar Point Cloud Compression
  • Sep 19, 2021
  • Shashank N Sridhara + 2 more

We present an efficient voxelization method to encode the geometry and attributes of 3D point clouds obtained from autonomous vehicles. Due to the circular scanning trajectory of sensors, the geometry of LiDAR point clouds is inherently different from that of point clouds captured with RGBD cameras. Our method exploits these specific properties by representing points in cylindrical coordinates instead of conventional Cartesian coordinates. We demonstrate that the Region Adaptive Hierarchical Transform (RAHT) can be extended to this setting, leading to attribute encoding based on a volumetric partition in cylindrical coordinates. Experimental results show that our proposed voxelization outperforms conventional approaches based on Cartesian coordinates for this type of data. We observe a significant improvement in attribute coding performance, with a 5-10% reduction in bitrate, and in the octree representation, with a 35-45% reduction in bits.
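The coordinate change at the heart of this abstract is easy to illustrate. The sketch below is a minimal toy version, not the paper's actual pipeline: the quantization step sizes (`r_step`, `theta_step`, `z_step`) are hypothetical parameters chosen for illustration.

```python
import numpy as np

def to_cylindrical(points):
    """Convert Cartesian LiDAR points (x, y, z) to cylindrical (r, theta, z).

    The sensor's circular scanning pattern makes point density roughly
    uniform in azimuth, so a uniform grid in (r, theta, z) matches the
    data better than a Cartesian voxel grid.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x**2 + y**2)   # radial distance from the sensor axis
    theta = np.arctan2(y, x)   # azimuth angle in [-pi, pi]
    return np.stack([r, theta, z], axis=1)

def voxelize(cyl, r_step=0.5, theta_step=np.deg2rad(0.2), z_step=0.1):
    """Quantize cylindrical coordinates onto a regular grid (toy step sizes)."""
    steps = np.array([r_step, theta_step, z_step])
    return np.floor(cyl / steps).astype(np.int64)
```

A point straight ahead at 1 m maps to (r=1, theta=0), while one 2 m to the left maps to (r=2, theta=pi/2); the integer bin triples are what an octree or RAHT-style coder would then operate on.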

  • Research Article
  • Cited by 65
  • 10.1109/tip.2022.3180904
PU-Dense: Sparse Tensor-Based Point Cloud Geometry Upsampling.
  • Jan 1, 2022
  • IEEE Transactions on Image Processing
  • Anique Akhtar + 4 more

Due to the increased popularity of augmented and virtual reality experiences, the interest in capturing high-resolution real-world point clouds has never been higher. Loss of details and irregularities in point cloud geometry can occur during the capturing, processing, and compression pipeline. It is essential to address these challenges by being able to upsample a low Level-of-Detail (LoD) point cloud into a high LoD point cloud. Current upsampling methods suffer from several weaknesses in handling point cloud upsampling, especially in dense real-world photo-realistic point clouds. In this paper, we present a novel geometry upsampling technique, PU-Dense, which can process a diverse set of point clouds including synthetic mesh-based point clouds, real-world high-resolution point clouds, real-world indoor LiDAR scanned objects, as well as outdoor dynamically acquired LiDAR-based point clouds. PU-Dense employs a 3D multiscale architecture using sparse convolutional networks that hierarchically reconstructs an upsampled point cloud geometry via progressive rescaling and multiscale feature extraction. The framework employs a UNet-type architecture that downscales the point cloud to a bottleneck and then upscales it to a higher level-of-detail (LoD) point cloud. PU-Dense introduces a novel Feature Extraction Unit that incorporates multiscale spatial learning by employing filters at multiple sampling rates and receptive fields. The architecture is memory efficient and is driven by a binary voxel occupancy classification loss that allows it to process high-resolution dense point clouds with millions of points at inference time. Qualitative and quantitative experimental results show that our method significantly outperforms the state-of-the-art approaches by a large margin while having much lower inference time complexity. We further test our method on high-resolution photo-realistic datasets. In addition, our method can handle noisy data well, and we show that our approach is memory efficient compared to the state-of-the-art methods.

  • Research Article
  • 10.5194/isprs-annals-x-1-2024-123-2024
A Weakly Supervised Vehicle Detection Method from LiDAR Point Clouds
  • May 9, 2024
  • ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
  • Yiyuan Li + 5 more

Abstract. Training LiDAR point cloud object detectors requires a significant amount of annotated data, which is time-consuming and labor-intensive. Although weakly supervised 3D LiDAR-based methods have been proposed to reduce the annotation cost, their performance could be further improved. In this work, we propose a weakly supervised LiDAR-based point cloud vehicle detector that does not require any labels for the proposal generation stage and needs only a few labels for the refinement stage. It comprises two primary modules. The first is an unsupervised proposal generation module based on the geometry of point clouds. The second is the pseudo-label refinement module. We validate our method on two point cloud object detection datasets, namely KITTI and ONCE, and compare it with various existing weakly supervised point cloud object detection methods. The experimental results demonstrate the method's effectiveness with a small amount of labeled LiDAR point clouds.

  • Research Article
  • Cited by 65
  • 10.1109/tcsvt.2021.3100279
Lossless Coding of Point Cloud Geometry Using a Deep Generative Model
  • Dec 1, 2021
  • IEEE Transactions on Circuits and Systems for Video Technology
  • Dat Thanh Nguyen + 3 more

This paper proposes a lossless point cloud (PC) geometry compression method that uses neural networks to estimate the probability distribution of voxel occupancy. First, to take into account the PC sparsity, our method adaptively partitions a point cloud into multiple voxel block sizes. This partitioning is signalled via an octree. Second, we employ a deep auto-regressive generative model to estimate the occupancy probability of each voxel given the previously encoded ones. We then employ the estimated probabilities to efficiently code a block using a context-based arithmetic coder. Our context has variable size and can expand beyond the current block to learn more accurate probabilities. We also consider using data augmentation techniques to increase the generalization capability of the learned probability models, in particular in the presence of noise and lower-density point clouds. Experimental evaluation, performed on a variety of point clouds from four different datasets with diverse characteristics, demonstrates that our method significantly reduces the rate for lossless coding (by up to 37%) compared to the state-of-the-art MPEG codec.
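Why a better occupancy probability model directly lowers the bitrate can be seen from the ideal code length of an arithmetic coder, about -log2(p) bits per symbol. The toy function below is a sketch of this information-theoretic accounting, not the paper's network or coder; the probability vectors are invented illustration values.

```python
import numpy as np

def ideal_code_length(occupancy, probs):
    """Bits an ideal arithmetic coder spends coding binary voxel
    occupancies under a model assigning P(occupied) = probs.

    For each voxel the cost is -log2 of the probability the model
    gave to the symbol that actually occurred.
    """
    p = np.where(occupancy == 1, probs, 1.0 - probs)
    return float(-np.log2(p).sum())

# A model whose probabilities track the true occupancies spends fewer
# bits than an uninformative p = 0.5 model (which costs 1 bit/voxel):
occ = np.array([1, 1, 0, 1])
sharp = ideal_code_length(occ, np.array([0.9, 0.8, 0.1, 0.95]))
flat = ideal_code_length(occ, np.array([0.5, 0.5, 0.5, 0.5]))
```

Here `flat` is exactly 4 bits, while `sharp` is well under 1 bit: the gap is the rate saving a learned context model can buy.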

  • Conference Article
  • Cited by 19
  • 10.1109/icra.2017.7989523
O-POCO: Online point cloud compression mapping for visual odometry and SLAM
  • May 1, 2017
  • Luis Contreras + 1 more

This paper presents O-POCO, a visual odometry and SLAM system that makes online decisions regarding what to map and what to ignore. It takes a point cloud from classical SfM and aims to sample it online by selecting map features useful for future 6D relocalisation. We use the camera's traveled trajectory to compartmentalize the point cloud, along with visual and spatial information to sample and compress the map. We propose and evaluate a number of different information layers such as the relative entropy of the descriptor information, a map-feature occupancy grid, and the point cloud's geometry error. We compare our proposed system against SfM as well as online and offline ORB-SLAM, using publicly available datasets in addition to our own. Results show that our online compression strategy is capable of outperforming the baseline even when the number of features per key-frame used for mapping is four times smaller.

  • Conference Article
  • Cited by 24
  • 10.1145/3304109.3306224
Using neighbouring nodes for the compression of octrees representing the geometry of point clouds
  • Jun 18, 2019
  • Sébastien Lasserre + 2 more

The geometry of a point cloud is commonly represented by an octree recursively decomposing a 3D volume into eight child sub-volumes. Said volumes and sub-volumes are associated with nodes and child-nodes of the octree. The geometry is defined by the occupancy information indicating the presence or not of a point in each of the sub-volumes. This naturally leads to an eight-bit occupancy information to be coded for each internal node of the tree. This paper introduces a new binarization scheme to efficiently compress the occupancy information using an optimal set of binary entropy coders. Then, it is shown how using the occupancy information of neighbouring nodes helps to compress the occupancy bits associated with the child nodes of the current node. This information is used to contextualise the binarization scheme by computing, firstly a neighbour configuration, secondly a number of neighbours with occupied child nodes adjacent to the current child node, and thirdly an intra predictor. Objective results show lossless geometry compression gains between 60% and 75% on virtual reality oriented dense point clouds used by MPEG, reaching sub-bit per point bit-rates for the lossless intra coding of such point clouds. Solid gains (between 5% and 25% depending upon the sampling) are also observed on sparse point clouds captured by a LiDAR (Light Detection and Ranging) device attached to a moving vehicle or representing 3D maps.
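The eight-bit occupancy information this abstract describes is just the eight child-occupancy flags of an internal node packed into one byte. The sketch below shows that packing under an assumed (hypothetical) fixed child ordering; the paper's contribution is the binarization and context scheme for entropy-coding this byte, which is not reproduced here.

```python
def occupancy_byte(child_occupied):
    """Pack the occupancy flags of a node's eight child sub-volumes
    into the single occupancy byte coded per internal octree node.

    child_occupied: sequence of 8 booleans, one per child, in a fixed
    child order (the order here is an arbitrary choice for illustration).
    """
    assert len(child_occupied) == 8
    byte = 0
    for i, occ in enumerate(child_occupied):
        if occ:
            byte |= 1 << i   # set bit i when child i contains a point
    return byte

def unpack_occupancy(byte):
    """Inverse: recover the eight child flags from the occupancy byte."""
    return [bool(byte >> i & 1) for i in range(8)]
```

Only nodes with a nonzero byte are recursed into, which is what makes the octree both a partition and a compact geometry code.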

  • Research Article
  • Cited by 63
  • 10.1016/j.compag.2020.105818
Pose estimation and adaptable grasp configuration with point cloud registration and geometry understanding for fruit grasp planning
  • Oct 7, 2020
  • Computers and Electronics in Agriculture
  • Ning Guo + 4 more


  • Research Article
  • Cited by 4
  • 10.11834/jig.230004
Scene point cloud understanding and reconstruction technologies in 3D space
  • Jan 1, 2023
  • Journal of Image and Graphics
  • Jingyu Gong + 8 more

3D scene understanding and reconstruction technologies enable computers to reproduce real scenes with high fidelity and guide machines to understand the entire real world in terms of 3D space, giving machines enough intelligence to participate in real-world production and construction and to serve human decision-making and daily life through scene simulation. These technologies mainly comprise scene point cloud feature extraction, scan registration and fusion, scene understanding and semantic segmentation, and scanned-object point cloud completion and fine-grained reconstruction. When processing real scans, the influence of scanning devices, viewing angles, distances, and scene complexity places higher demands on the accuracy and robustness of these techniques, making them highly challenging. Among them, feature extraction and registration/fusion of raw scanned point clouds aim to match features across multiple scanned regions of the same scene and fuse them into a complete scene point cloud, forming the foundation of understanding and reconstruction. Scene point cloud understanding and semantic segmentation aim to perceive the scene model as a whole and partition it, according to semantic features, into point clouds of functional objects or even parts, constituting the core of the pipeline. Subsequent fine-grained object point cloud completion focuses on recovering the structure of scanned objects and completing their missing parts, and is the key technology for fine-grained reconstruction of scene object point clouds. Around these technologies, this paper analyzes the application areas and research directions related to 3D-point-cloud-based scene understanding and reconstruction, summarizes frontier progress and research results in China and abroad, and offers an outlook on future research directions and technical development.

  • Research Article
  • 10.1109/tip.2025.3648141
Implicit Neural Compression of Point Clouds.
  • Jan 1, 2026
  • IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
  • Hongning Ruan + 5 more

Point clouds have gained prominence across numerous applications due to their ability to accurately represent 3D objects and scenes. However, efficiently compressing unstructured, high-precision point cloud data remains a significant challenge. In this paper, we propose NeRC³, a novel point cloud compression framework that leverages implicit neural representations (INRs) to encode both geometry and attributes of dense point clouds. Our approach employs two coordinate-based neural networks: one maps spatial coordinates to voxel occupancy, while the other maps occupied voxels to their attributes, thereby implicitly representing the geometry and attributes of a voxelized point cloud. The encoder quantizes and compresses network parameters alongside auxiliary information required for reconstruction, while the decoder reconstructs the original point cloud by inputting voxel coordinates into the neural networks. Furthermore, we extend our method to dynamic point cloud compression through techniques that reduce temporal redundancy, including a 4D spatio-temporal representation termed 4D-NeRC³. Experimental results validate the effectiveness of our approach: For static point clouds, NeRC³ outperforms the octree-based G-PCC standard and existing INR-based methods. For dynamic point clouds, 4D-NeRC³ achieves superior geometry compression performance compared to the latest G-PCC and V-PCC standards, while matching state-of-the-art learning-based methods. It also demonstrates competitive performance in joint geometry and attribute compression.
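The decoding side of an INR-based geometry codec reduces to evaluating a small coordinate network on a voxel grid and thresholding. The following is a minimal sketch of that idea only: the two-layer network, its random weights, the grid size, and the 0.5 threshold are all invented stand-ins, not the NeRC³ architecture or parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer MLP f: R^3 -> [0, 1] standing in for the geometry network:
# it maps a voxel coordinate to an occupancy probability. A codec of this
# family transmits (compressed) weights rather than the points themselves.
W1, b1 = rng.normal(size=(3, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 1)), np.zeros(1)

def occupancy_prob(coords):
    """Forward pass: ReLU hidden layer, then sigmoid output probability."""
    h = np.maximum(coords @ W1 + b1, 0.0)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))

# "Decoding": evaluate the network on every voxel coordinate of a small
# 4x4x4 grid and keep the voxels whose predicted occupancy exceeds 0.5.
grid = np.stack(np.meshgrid(*[np.arange(4)] * 3, indexing="ij"), -1).reshape(-1, 3)
decoded = grid[occupancy_prob(grid / 4.0).ravel() > 0.5]
```

The rate is then governed by how compactly the weights quantize, and the distortion by how faithfully the network separates occupied from empty voxels.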

  • Research Article
  • Cited by 1
  • 10.5194/isprs-annals-x-1-w1-2023-597-2023
ASSESSING THE ALIGNMENT BETWEEN GEOMETRY AND COLORS IN TLS COLORED POINT CLOUDS
  • Dec 5, 2023
  • ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
  • Z Wang + 3 more

Abstract. The integration of the color information from RGB cameras with the point cloud geometry is used in numerous applications. However, little attention has been paid to errors that occur when aligning colors to points in terrestrial laser scanning (TLS) point clouds. Such errors may impact the performance of algorithms that utilize colored point clouds. Herein, we propose a procedure for assessing the alignment between the TLS point cloud geometry and colors. The procedure is based upon identifying artificial targets observed in both LiDAR-based point cloud intensity data and camera-based RGB data, and quantifying the quality of the alignment using differences between the target center coordinates estimated separately from these two data sources. Experimental results with eight scanners show that the quality of the alignment depends on the scanner and the software used for colorizing the point clouds, and may change with changing environmental conditions. While we found the effects of misalignment to be negligible for some scanners, four of the scanners exhibited clearly systematic patterns exceeding the beam divergence and the image and scan resolution. The maximum deviations were about 2 mrad perpendicular to the line-of-sight when colorizing the point clouds with the respective manufacturer's software or scanner in-built functions, and up to about 5 mrad when using a different software. Testing the alignment quality, e.g., using the approach presented herein, is thus important for applications requiring accurate alignment of the RGB colors with the point cloud geometry.

  • Research Article
  • Cited by 28
  • 10.1109/tcsvt.2021.3101852
Lossy Point Cloud Geometry Compression via Region-Wise Processing
  • Dec 1, 2021
  • IEEE Transactions on Circuits and Systems for Video Technology
  • Wenjie Zhu + 4 more

Point cloud geometry (PCG) precisely represents arbitrarily shaped 3D objects and scenes and is of great interest to a wide range of applications, which creates a pressing need for high-efficiency PCG compression for transmission and storage. Existing PCG coding mostly relies on the octree model, in which point-wise processing is applied without exploring nonlocal regional geometry similarity across the entire 3D surface. This work instead proposes region-wise processing that leverages region similarity to exploit inter-region redundancy for efficient lossy point cloud geometry compression. Towards this goal, a given PCG is first segmented into numerous local regions, each of which comprises a portion of the point cloud surface and can be represented by a surface vector that describes the geometry shape numerically in a projected principal space. Subsequently, these regions are grouped into several discriminative clusters, assuring that inter-cluster similarity is minimized and intra-cluster similarity is maximized simultaneously, where the similarity is calculated using the regional surface vectors. In each cluster, we set a reference region having the largest similarity score to the others, which enables predicting the non-reference regions from the reference one using an alignment transform. In the end, we encode the reference regions directly using the lossless mode of Geometry-based Point Cloud Compression (G-PCC), while the corresponding non-reference regions are signaled using the associated transform parameters.
Compared with the state-of-the-art G-PCC using the octree model, our region-wise approach offers remarkable coding efficiency improvement, e.g., 32.4% and 22.0% Bjontegaard-delta rate (BD-Rate) gains for the respective point-to-point (D1) and point-to-plane (D2) distortion evaluations, across a variety of common test sequences used in the standards committee.

  • Conference Article
  • Cited by 15
  • 10.1109/icip42928.2021.9506333
Dynamic Point Cloud Geometry Compression using Cuboid based Commonality Modeling Framework
  • Sep 19, 2021
  • Ashek Ahmmed + 3 more

Point clouds in their uncompressed format require a very high data rate for storage and transmission. The video-based point cloud compression (V-PCC) technique projects a dynamic point cloud into geometry and texture video sequences. The projected geometry and texture video frames are then encoded using a modern video coding standard such as HEVC. However, the HEVC encoder is unable to fully exploit the global commonality that exists within a geometry frame and between successive geometry frames. This is because in HEVC the partitioning of the current frame starts from a rigid 64 × 64 pixel level without considering the structure of the scene to be coded. In this paper, an improved commonality modeling framework is proposed, leveraging cuboid-based frame partitioning, to encode point cloud geometry frames. The associated frame-partitioning scheme is based on statistical properties of the current geometry frame and therefore yields a flexible block partitioning structure composed of cuboids. Additionally, the proposed commonality modeling approach is computationally efficient and has a compact representation. Experimental results show that when the V-PCC reference encoder is augmented by the proposed commonality modeling technique, bit rate savings of 2.71% and 4.25% are achieved for the geometry sequences of full-body and upper-body human point clouds, respectively.

  • Research Article
  • Cited by 20
  • 10.1109/tcsvt.2022.3223898
Rate-Distortion Modeling for Bit Rate Constrained Point Cloud Compression
  • May 1, 2023
  • IEEE Transactions on Circuits and Systems for Video Technology
  • Pan Gao + 2 more

As one of the main representation formats of the 3D real world, well-suited for virtual reality and augmented reality applications, point clouds have gained a lot of popularity. In order to reduce the huge amount of data, a considerable amount of research on point cloud compression has been done. However, given a target bit rate, how to properly choose the color and geometry quantization parameters for compressing point clouds is still an open issue. In this paper, we propose a rate-distortion model based quantization parameter selection scheme for bit rate constrained point cloud compression. Firstly, to overcome the measurement uncertainty in evaluating the distortion of point clouds, we propose a unified model that combines the geometry distortion and color distortion. In this model, we take into account the correlation between the geometry and color variables of point clouds and derive a dimensionless quantity to represent the overall quality degradation. Then, we derive the relationships of overall distortion and bit rate with the quantization parameters. Finally, we formulate bit rate constrained point cloud compression as a constrained minimization problem using the derived polynomial models and deduce the solution via an iterative numerical method. Experimental results show that the proposed algorithm can achieve optimal decoded point cloud quality at various target bit rates, and substantially outperforms the video-rate-distortion model based point cloud compression scheme.

  • Conference Article
  • Cited by 82
  • 10.1109/qomex48832.2020.9123087
A Generalized Hausdorff Distance Based Quality Metric for Point Cloud Geometry
  • May 1, 2020
  • Alireza Javaheri + 3 more

Reliable quality assessment of decoded point cloud geometry is essential to evaluate the compression performance of emerging point cloud coding solutions and guarantee some target quality of experience. This paper proposes a novel point cloud geometry quality assessment metric based on a generalization of the Hausdorff distance. To achieve this goal, the so-called generalized Hausdorff distance for multiple rankings is exploited to identify the best performing quality metric in terms of correlation with the MOS scores obtained from a subjective test campaign. The experimental results show that the quality metric derived from the classical Hausdorff distance leads to low objective-subjective correlation and, thus, fails to accurately evaluate the quality of decoded point clouds for emerging codecs. However, the quality metric derived from the generalized Hausdorff distance with an appropriately selected ranking outperforms the MPEG-adopted geometry quality metrics when decoded point clouds with different types of coding distortions are considered.
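The intuition behind generalizing the Hausdorff distance can be sketched directly: instead of taking the maximum nearest-neighbour distance (which a single outlier point dominates), take a chosen ranked quantile of those distances. The implementation below is a one-sided toy version for illustration; the `rank` parameter name and the brute-force pairwise distance computation are choices of this sketch, not the paper's formulation.

```python
import numpy as np

def generalized_hausdorff(A, B, rank=1.0):
    """One-sided generalized Hausdorff distance between point sets A and B.

    For each point in A, take the distance to its nearest neighbour in B,
    then return the `rank`-quantile of those distances. rank=1.0 recovers
    the classical Hausdorff maximum; lower ranks discard outliers.
    """
    # Brute-force pairwise distances (fine for small sets; a KD-tree
    # would be used for real point clouds).
    d = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(-1))
    nearest = d.min(axis=1)          # NN distance for each point of A
    return float(np.quantile(nearest, rank))
```

With `A = {(0,0,0), (10,0,0)}` and `B = {(0,0,0)}`, rank 1.0 returns the classical maximum 10.0, while a lower rank reduces the influence of the distant outlier, which is the property the paper correlates with subjective scores.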
