NF-PCAC: Normalizing Flow Based Point Cloud Attribute Compression

Abstract

Learning-based point cloud (PC) compression is a promising research avenue to reduce the transmission and storage costs for PC applications. Existing learning-based methods to compress PCs have mainly focused on geometry and employ variational autoencoders to learn compact signal representations. However, autoencoders leverage low-dimensional bottlenecks that limit the maximum reconstruction quality, even at high bitrates. In this paper, we propose a different and novel approach to compress PC attributes by using normalizing flows. Since normalizing flows model invertible transforms, the proposed approach can achieve better reconstruction quality than variational autoencoders over a large range of bitrates. Our Normalizing Flow-based Point Cloud Attribute Compression (NF-PCAC) outperforms previous learning-based methods for attribute compression and performs comparably to G-PCC v.14, showing the potential of this scheme for PC compression.
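The invertibility argument in the abstract can be illustrated with a toy affine coupling layer, a common building block of normalizing flows. This is only a minimal sketch, not the NF-PCAC architecture: the function names, the scalar parameters `scale_w` and `shift_w`, and the sample values are all hypothetical. It shows that the transform keeps the input dimensionality and can be undone exactly, so there is no dimensional bottleneck of the kind an autoencoder imposes.

```python
import math

def coupling_forward(x, scale_w, shift_w):
    """Affine coupling: the first half passes through unchanged; the second
    half is scaled and shifted by functions of the first half."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    log_s = [math.tanh(scale_w * a) for a in x1]  # bounded log-scale
    t = [shift_w * a for a in x1]                 # shift term
    y2 = [b * math.exp(s) + ti for b, s, ti in zip(x2, log_s, t)]
    return x1 + y2

def coupling_inverse(y, scale_w, shift_w):
    """Exact inverse: recompute scale and shift from the untouched half."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    log_s = [math.tanh(scale_w * a) for a in y1]
    t = [shift_w * a for a in y1]
    x2 = [(b - ti) * math.exp(-s) for b, s, ti in zip(y2, log_s, t)]
    return y1 + x2

# Hypothetical 4-dimensional feature vector (illustrative values only).
x = [0.2, -1.5, 0.7, 3.0]
y = coupling_forward(x, scale_w=0.8, shift_w=-0.3)
x_rec = coupling_inverse(y, scale_w=0.8, shift_w=-0.3)
max_error = max(abs(a - b) for a, b in zip(x, x_rec))
print(len(y) == len(x), max_error)
```

Because the latent `y` has the same size as `x` and the inverse is exact up to floating-point rounding, any loss in a codec built this way would presumably come from quantizing the latent rather than from the transform itself.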

Similar Papers
  • Conference Article
  • Cite Count Icon 6
  • 10.1109/vcip56404.2022.10008821
Augmented Normalizing Flow for Point Cloud Geometry Coding
  • Dec 13, 2022
  • Siao-Yu Li + 4 more

With the increased popularity of immersive media, point clouds have become one of the popular data representations for presenting 3D scenes. The huge amount of point cloud data poses a great challenge to their storage and real-time transmission, which calls for efficient point cloud compression. This paper presents a novel point cloud geometry compression technique based on end-to-end learning of an augmented normalizing flow (ANF) model to represent the occupancy status of voxelized data points. The higher expressive power of ANF compared with variational autoencoders (VAE) is leveraged for the first time to represent binary occupancy status. Compared to two coding standards developed by MPEG, namely G-PCC (geometry-based point cloud compression) and V-PCC (video-based point cloud compression), our method achieves more than 80% and 30% bitrate reduction, respectively. Compared to several learning-based methods, our method also yields better performance.

  • Research Article
  • Cite Count Icon 3
  • 10.1109/tpami.2025.3594355
Deep Learning-Based Point Cloud Compression: An In-Depth Survey and Benchmark.
  • Nov 1, 2025
  • IEEE transactions on pattern analysis and machine intelligence
  • Wei Gao + 5 more

With the maturity of 3D capture technology, the explosive growth of point cloud data has burdened the storage and transmission process. Traditional hybrid point cloud compression (PCC) tools relying on handcrafted priors have limited compression performance and are increasingly weak in addressing the burden induced by data growth. Recently, deep learning-based PCC methods have been introduced to continue to push the PCC performance boundary. With the thriving of deep PCC, the community urgently demands a systematic overview to conclude the past progress and present future research directions. In this paper, we have a detailed review that covers popular point cloud datasets, algorithm evolution, benchmarking analysis, and future trends. Concretely, we first introduce several widely-used PCC datasets according to their major properties. Then the algorithm evolution of existing studies on deep PCC, including lossy ones and lossless ones proposed for various point cloud types, is reviewed. Apart from academic studies, we also investigate the development of relevant international standards (i.e., MPEG standards and JPEG standards). To help have an in-depth understanding of the advance of deep PCC, we select a representative set of methods and conduct extensive experiments on multiple datasets. Comprehensive benchmarking comparisons and analysis reveal the pros and cons of previous methods. Finally, based on the profound analysis, we highlight the challenges and future trends of deep learning-based PCC, paving the way for further study.

  • Research Article
  • Cite Count Icon 1
  • 10.1109/tgrs.2025.3573206
DAPCC: Diverse Attention-Based Entropy Model for Dynamic LiDAR Point Cloud Compression
  • Jan 1, 2025
  • IEEE Transactions on Geoscience and Remote Sensing
  • Mingyue Cui + 6 more

LiDAR point cloud (LPC) compression is an indispensable component for 3D vision tasks, especially for dynamic point clouds. However, the existing methods based on traditional spatial-temporal attention are immature, causing little improvement in inter-frame feature extraction. In this paper, we propose Diverse Attention-based Point Cloud Compression (DAPCC), an LPC compression entropy model combining aggregation embedding modules for temporal point matching and spatial-temporal attention blocks for dynamic Octree node encoding, which can effectively utilize the change information of dynamic point clouds. Specifically, we first introduce aggregation embedding to match the Octree sequences from two sweeps to establish temporal correlation. To effectively capture the feature details, we further design local and global combined attention for the spatial-temporal information of point clouds which can focus on the whole context. Finally, we organize a symmetric MLP module capable of strengthening vital features. We conduct experiments of static and dynamic compression on both indoor/outdoor point cloud benchmark datasets (i.e., ScanNet, SemanticKITTI, and MPEG Common Test Conditions (CTC) Category 3 datasets) and downstream applications (i.e., vehicle detection and semantic segmentation). Compared with the previous state-of-the-art methods, our method achieves up to 14.7% bpp and 45% decoding time savings and adapts to the downstream tasks with almost no impact on performance.

  • Research Article
  • Cite Count Icon 7
  • 10.1109/tcsvt.2021.3129071
Guest Editorial Introduction to the Special Issue on Recent Advances in Point Cloud Processing and Compression
  • Dec 1, 2021
  • IEEE Transactions on Circuits and Systems for Video Technology
  • Zhu Li + 5 more

A point cloud is a set of 3D points that can be used to represent a 3D surface. Each point has a spatial position (x, y, z) and a vector of attributes, such as colors, material reflection, or normal. As point clouds are capable of reconstructing 3D objects or scenes, they have the potential to be widely used in various applications such as auto-driving and 6-degree virtual reality. However, the following properties of point cloud make the point cloud compression and processing become rather challenging. 1) Unstructured. The point cloud is a series of non-uniform sampled points. On the one hand, it makes the correlations among various points difficult to be utilized for compression. On the other hand, the convolutional neural network that is widely used in image/video processing cannot be applied to the point cloud processing. 2) Unordered. Unlike images and videos, the point cloud is a set of points without a specific order. Therefore, both the point cloud processing and compression algorithms need to be invariant to any permutations of the input point clouds.

  • Research Article
  • Cite Count Icon 6
  • 10.3390/s25061660
Three-Dimensional Point Cloud Applications, Datasets, and Compression Methodologies for Remote Sensing: A Meta-Survey.
  • Mar 7, 2025
  • Sensors (Basel, Switzerland)
  • Emil Dumic + 1 more

This meta-survey provides a comprehensive review of 3D point cloud (PC) applications in remote sensing (RS), essential datasets available for research and development purposes, and state-of-the-art point cloud compression methods. It offers a comprehensive exploration of the diverse applications of point clouds in remote sensing, including specialized tasks within the field, precision agriculture-focused applications, and broader general uses. Furthermore, datasets that are commonly used in remote-sensing-related research and development tasks are surveyed, including urban, outdoor, and indoor environment datasets; vehicle-related datasets; object datasets; agriculture-related datasets; and other more specialized datasets. Due to their importance in practical applications, this article also surveys point cloud compression technologies from widely used tree- and projection-based methods to more recent deep learning (DL)-based technologies. This study synthesizes insights from previous reviews and original research to identify emerging trends, challenges, and opportunities, serving as a valuable resource for advancing the use of point clouds in remote sensing.

  • Research Article
  • Cite Count Icon 1
  • 10.1109/tmm.2025.3565958
Hierarchical Distortion Learning for Fast Lossy Compression of Point Clouds
  • Jan 1, 2025
  • IEEE Transactions on Multimedia
  • Pengpeng Yu + 4 more

The growth of 3D point cloud applications requires efficient compression techniques for high-quality and low-latency services. Recently, learning-based point cloud compression models have made significant progress. However, geometric distortion resulting from downsampling limits the feature depth within large-scale point clouds, thereby constraining the receptive field and suppressing redundancy removal. Moreover, the issues of computational efficiency and reconstruction quality still persist in the compression of large-scale point clouds. To address these challenges, we propose a hierarchical distortion learning framework for end-to-end lossy compression of point clouds. First, we design a feature residual compression module to efficiently transmit shallow semantics between the encoder and the decoder, which enables a lightweight design of our framework. Second, we introduce a geometry residual compression module to progressively complement the reconstruction distortion, avoiding the accumulation of geometric distortion. By integrating these two modules and employing sufficient downsampling processes, we develop a high-performance framework with a significantly enlarged receptive field and low computational cost. Extensive experiments demonstrate that our method achieves state-of-the-art performance in geometry lossy compression, while delivering competitive performance in joint geometry and color lossy compression with fast running speed. Code is available at https://github.com/pengpeng-yu/FastPCC.

  • Research Article
  • 10.1109/tip.2025.3648141
Implicit Neural Compression of Point Clouds.
  • Jan 1, 2026
  • IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
  • Hongning Ruan + 5 more

Point clouds have gained prominence across numerous applications due to their ability to accurately represent 3D objects and scenes. However, efficiently compressing unstructured, high-precision point cloud data remains a significant challenge. In this paper, we propose NeRC$^{\textbf{3}}$, a novel point cloud compression framework that leverages implicit neural representations (INRs) to encode both geometry and attributes of dense point clouds. Our approach employs two coordinate-based neural networks: one maps spatial coordinates to voxel occupancy, while the other maps occupied voxels to their attributes, thereby implicitly representing the geometry and attributes of a voxelized point cloud. The encoder quantizes and compresses network parameters alongside auxiliary information required for reconstruction, while the decoder reconstructs the original point cloud by inputting voxel coordinates into the neural networks. Furthermore, we extend our method to dynamic point cloud compression through techniques that reduce temporal redundancy, including a 4D spatio-temporal representation termed 4D-NeRC$^{\textbf{3}}$. Experimental results validate the effectiveness of our approach: For static point clouds, NeRC$^{\textbf{3}}$ outperforms the octree-based G-PCC standard and existing INR-based methods. For dynamic point clouds, 4D-NeRC$^{\textbf{3}}$ achieves superior geometry compression performance compared to the latest G-PCC and V-PCC standards, while matching state-of-the-art learning-based methods. It also demonstrates competitive performance in joint geometry and attribute compression.

  • Research Article
  • Cite Count Icon 17
  • 10.1145/3550454.3555481
3QNet
  • Nov 30, 2022
  • ACM Transactions on Graphics
  • Tianxin Huang + 7 more

Since the development of 3D applications, the point cloud, as a spatial description easily acquired by sensors, has been widely used in multiple areas such as SLAM and 3D reconstruction. Point Cloud Compression (PCC) has also attracted more attention as a primary step before point cloud transferring and saving, where the geometry compression is an important component of PCC to compress the points' geometrical structures. However, existing non-learning-based geometry compression methods are often limited by manually pre-defined compression rules. Though learning-based compression methods can significantly improve the algorithm performances by learning compression rules from data, they still have some defects. Voxel-based compression networks introduce precision errors due to the voxelized operations, while point-based methods may have relatively weak robustness and are mainly designed for sparse point clouds. In this work, we propose a novel learning-based point cloud compression framework named 3D Point Cloud Geometry Quantization Compression Network (3QNet), which overcomes the robustness limitation of existing point-based methods and can handle dense points. By learning a codebook including common structural features from simple and sparse shapes, 3QNet can efficiently deal with multiple kinds of point clouds. According to experiments on object models, indoor scenes, and outdoor scans, 3QNet can achieve better compression performances than many representative methods.

  • Research Article
  • 10.1109/tip.2026.3676604
Rate-Reconfigurable Deep Point Cloud Compression With Perceptual Bit Allocation Optimization.
  • Jan 1, 2026
  • IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
  • Yun Zhang + 5 more

Conventional end-to-end learning-based point cloud compression requires training multiple models to adapt to different target bit rates. Moreover, the rate difference between geometry and attribute components of point clouds is not well-considered. In this paper, we propose an end-to-end Rate-Reconfigurable Deep Point Cloud Compression (RR-DPCC) with on/off-line Perceptual Bit Allocation Optimization (PBAO-ON/OFF), which achieves arbitrary bit rate control with one trained deep model and high efficiency joint geometry and attribute coding. First, we propose the framework of the RR-DPCC using PBAO-ON/OFF, which includes Point Cloud Quality Assessment (PCQA) for perceptual quality measurement, PBAO-ON/OFF modules for bit allocation and RR-DPCC for high efficiency point cloud coding. Second, we propose a one-stream network of the RR-DPCC to encode the attribute and geometry of point clouds jointly. Moreover, in RR-DPCC, a bitrate reconfigurable module is proposed to encode multiple fine-grained bitrate points with one trained model and a rate allocation module is proposed to allocate bits between geometry and attribute. Third, we propose on/off-line PBAO algorithms to maximize the perceptual quality of the reconstructed point cloud, where the bits are properly allocated based on the importance of geometry and attribute. Meanwhile, rate-distortion models (R-$\alpha$/$\beta$ and D-$\alpha$/$\beta$) are derived for high accuracy rate control and bit allocation. Experimental results show that the proposed RR-DPCC achieves fine-grained bitrate control and allocation through a single trained model. When combined with PBAO-ON, the proposed RR-DPCC reduces the bit rate by 6.56% and 18.68% on average compared with the state-of-the-art V-PCC and Deep Joint Geometry and Attribute Compression (Deep-JGAC), respectively. When combined with PBAO-OFF, it achieves 4.90% and 15.34% bit rate reductions on average, and reduces encoding/decoding time by 98.38%/22.05% and 53.75%/10.04% on average with respect to V-PCC and Deep-JGAC.

  • Research Article
  • Cite Count Icon 14
  • 10.1109/tim.2023.3290291
Pseudo-Reference Point Cloud Quality Measurement Based on Joint 2-D and 3-D Distortion Description
  • Jan 1, 2023
  • IEEE Transactions on Instrumentation and Measurement
  • Renwei Tu + 5 more

Point cloud (PC) compression inevitably introduces distortion during communication, which can affect users’ visual experience. Thus, efficient point cloud quality measurement (PCQM) tools are highly desired to measure the PC’s visual quality. In this paper, a pseudo-reference PCQM metric based on joint two-dimensional (2D) and three-dimensional (3D) distortion description is proposed. In 2D description, aiming at the visual quality degradation reflected cooperatively by PC’s texture distortion and geometry distortion, a joint texture-geometry distribution with texture projection map and geometry projection map of the video-based point cloud compression (V-PCC) standard is constructed to measure the joint texture-geometry distortion of PC. Since the geometry distortion of PC results in the similar distortion phenomena in the geometry projection map and texture projection map, a self-reference geometry-texture structural similarity (SGT-SSIM) is proposed. The separate statistical features of texture projection map and geometry projection map are also considered. In 3D description, considering the limitations of using full-reference metric and the difficulty of directly reflecting PC cracks and outliers only by the V-PCC projection, a pseudo-reference PC is constructed by performing Poisson surface reconstruction on the distorted PC. Then, the point-to-distribution is used to directly characterize pseudo-referenced geometry distortion, while gray level-gradient co-occurrence matrix based on key points of PC is constructed to measure the texture distortion. Finally, the features with joint 2D and 3D distortion description are combined to measure the PC visual quality more comprehensively. Experimental results on five PC datasets demonstrate that the proposed metric has comparable performance to the existing full-reference metrics.

  • Conference Article
  • Cite Count Icon 2
  • 10.1109/ispacs51563.2021.9651108
Progressive Point Cloud Compression with the Fusion of Symmetry Based Convolutional Neural Pyramid and Vector Quantization
  • Nov 16, 2021
  • Gokulnath Vadivel + 3 more

In this paper, we present a novel 3D structure-aware image-based point cloud compression scheme, which applies the proposed Symmetry based Convolutional Neural Pyramid (SCNP) to compress colored point clouds view-by-view for 3D model transmission. Given an input 3D model, a preprocessing step is first applied to represent the input point cloud as a sequence of view-specific six-dimensional (6D) images, where each pixel is characterized by an RGB color vector and an XYZ 3D point. The transformed 6D images preserve the regular grid structure, and thus the redundant information is easily removed by conventional image/video compression techniques. Our SCNP first represents each 6D image as a multiple-level pyramid structure for progressive compression and transmission. The lowest resolution image at the highest level of the pyramid is then decomposed into multiple patches, each of which is coded as the index of a small dictionary through vector quantization. The residual images at other levels are also represented by the vector quantization codes with different patch sizes for progressively reconstructing the input colored point cloud. This process results in a multiple description coding scheme for 3D point cloud compression. With the pre-learned set of dictionaries, the projected view-specific 6D images of the input 3D model are encoded one-by-one to obtain the compressed results for 3D model transmission. At the receiver end, the 3D model is reconstructed by merging all the reconstructed point clouds, each of which is decoded from the corresponding view-specific image. Finally, a conventional 3D reconstruction approach is applied to remove redundant 3D points when reconstructing the 3D model. Experiments demonstrate the effectiveness of our approach, which attains better performance than the current state-of-the-art point cloud compression methods.

  • Research Article
  • Cite Count Icon 25
  • 10.1109/tmm.2022.3154927
Quantitative Comparison of Point Cloud Compression Algorithms With PCC Arena
  • Jan 1, 2023
  • IEEE Transactions on Multimedia
  • Cheng-Hao Wu + 5 more

With the growth of Extended Reality (XR) and capturing devices, point cloud representation has become attractive to academics and industry. Point Cloud Compression (PCC) algorithms further promote numerous XR applications that may change our daily life. However, in the literature, PCC algorithms are often evaluated with heterogeneous datasets, metrics, and parameters, making the results hard to interpret. In this article, we propose an open-source benchmark platform called PCC Arena. Our platform is modularized in three aspects: PCC algorithms, point cloud datasets, and performance metrics. Users can easily extend PCC Arena in each aspect to fulfill the requirements of their experiments. To show the effectiveness of PCC Arena, we integrate seven PCC algorithms into PCC Arena along with six point cloud datasets. We then compare the algorithms on ten carefully selected metrics to evaluate the quality of the output point clouds. We further conduct a user study to quantify the user-perceived quality of rendered images that are produced by different PCC algorithms. Several novel insights are revealed in our comparison: (i) Signal Processing (SP)-based PCC algorithms are stable for different usage scenarios, but the trade-offs between coding efficiency and quality should be carefully addressed, (ii) Neural Network (NN)-based PCC algorithms have the potential to consume lower bitrates yet provide similar results to SP-based algorithms, (iii) NN-based PCC algorithms may generate artifacts and suffer from long running time, and (iv) NN-based PCC algorithms are worth more in-depth studies as the recently proposed NN-based PCC algorithms improve the quality and running time. We believe that PCC Arena can play an essential role in allowing engineers and researchers to better interpret and compare the performance of future PCC algorithms.

  • Research Article
  • 10.1109/jsac.2025.3623164
Over-the-Air Learning-based Geometry Point Cloud Transmission
  • Jan 1, 2025
  • IEEE Journal on Selected Areas in Communications
  • Chenghong Bian + 2 more

A 3D point cloud is a three-dimensional data format generated by LiDARs and depth sensors, and is being increasingly used in a large variety of applications from autonomous vehicles to robotics and metaverse. This paper presents novel solutions for the efficient and reliable transmission of point clouds over wireless channels for real-time applications. We first propose SEmantic Point cloud Transmission (SEPT) for small-scale point clouds, which encodes the point cloud via an iterative downsampling and feature extraction process. At the receiver, the SEPT decoder reconstructs the point cloud with latent reconstruction and offset-based upsampling. A novel channel-adaptive module is proposed to allow SEPT to operate effectively over a wide range of channel conditions. Next, we propose OTA-NeRF, a scheme inspired by neural radiance fields. OTA-NeRF performs voxelization on the point cloud input and learns to encode the voxelized point cloud into a neural network. Instead of transmitting the extracted feature vectors as in SEPT, it transmits the learned neural network weights over the air in an analog fashion along with a few hyperparameters that are transmitted digitally. At the receiver, the OTA-NeRF decoder reconstructs the original point cloud using the received noisy neural network weights. To further increase the bandwidth efficiency of the OTA-NeRF scheme, a fine-tuning algorithm is developed, where only a fraction of the neural network weights are retrained and transmitted. Noticing the poor generality of the OTA-NeRF schemes where the neural network weights are trained for a specific point cloud, we propose an alternative approach, termed OTA-MetaNeRF, which encodes different input point clouds into the latent vectors with shared neural network weights. Extensive numerical experiments confirm that the proposed SEPT, OTA-NeRF and OTA-MetaNeRF schemes achieve superior or comparable performance over the conventional approaches, where an octree-based or a learning-based point cloud compression scheme is concatenated with a channel code. As an additional advantage, all schemes mitigate the cliff and leveling effects, making them particularly attractive for highly mobile scenarios. Finally, the run-time complexities of the schemes are evaluated to verify the capability of the proposed schemes for real-time communications.

  • Research Article
  • 10.1049/ell2.13080
A task‐driven sampling method based on graph convolution for 3D point cloud compression
  • Jan 1, 2024
  • Electronics Letters
  • Yakun Yang + 3 more

Previous point cloud compression methods only consider reducing the amount of data. However, in applications such as autonomous driving, compression methods must not only support smooth transmission but also preserve the efficiency of downstream tasks. To this end, a task‐driven sampling network based on graph convolution is proposed to achieve point cloud compression and recovery. First, a downsampling network is presented to simplify and compress the point cloud; to optimize the compressed point cloud for downstream tasks, the task loss is added to the loss function for end‐to‐end training. Then, an upsampling network with a residual correction unit is presented to recover and reconstruct the point cloud. Experiments for the point cloud classification task on the ModelNet40 dataset show that the compressed point cloud obtained through our network can achieve higher classification accuracy compared to other similar methods, and the reconstructed point cloud can further improve classification accuracy.

  • Conference Article
  • Cite Count Icon 65
  • 10.1109/cvpr42600.2020.00105
SSRNet: Scalable 3D Surface Reconstruction Network
  • Jun 1, 2020
  • Zhenxing Mi + 2 more

Existing learning-based surface reconstruction methods from point clouds are still facing challenges in terms of scalability and preservation of details on large-scale point clouds. In this paper, we propose the SSRNet, a novel scalable learning-based method for surface reconstruction. The proposed SSRNet constructs local geometry-aware features for octree vertices and designs a scalable reconstruction pipeline, which not only greatly enhances the prediction accuracy of the relative position between the vertices and the implicit surface, improving the surface reconstruction quality, but also allows dividing the point cloud and octree vertices and processing different parts in parallel for superior scalability on large-scale point clouds with millions of points. Moreover, SSRNet demonstrates outstanding generalization capability and only needs several surface data for training, much less than other learning-based reconstruction methods, which can effectively avoid overfitting. The trained model of SSRNet on one dataset can be directly used on other datasets with superior performance. Finally, the time consumption with SSRNet on a large-scale point cloud is acceptable and competitive. To our knowledge, the proposed SSRNet is the first to really bring a convincing solution to the scalability issue of the learning-based surface reconstruction methods, and is an important step to make learning-based methods competitive with respect to geometry processing methods on real-world and challenging data. Experiments show that our method achieves a breakthrough in scalability and quality compared with state-of-the-art learning-based methods.
