End-to-End Learned Lossy Dynamic Point Cloud Attribute Compression

Abstract

Recent advancements in point cloud compression have primarily emphasized geometry compression, while comparatively few efforts have been dedicated to attribute compression. This study introduces an end-to-end learned lossy dynamic attribute coding approach that utilizes an efficient high-dimensional convolution to capture extensive inter-point dependencies, enabling the efficient projection of attribute features into latent variables. Subsequently, we employ a context model that leverages the previous latent space in conjunction with an auto-regressive context model to encode the latent tensor into a bitstream. Evaluation of our method on widely utilized point cloud datasets from MPEG and Microsoft demonstrates its superior performance compared to the Region-Adaptive Hierarchical Transform (RAHT), the core attribute compression module of MPEG Geometry-based Point Cloud Compression (G-PCC), with a 38.1% Bjontegaard Delta-rate saving on average while ensuring low-complexity encoding and decoding.
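The Bjontegaard Delta-rate figure quoted above is the standard way such savings are reported. As a minimal illustration (not the authors' evaluation code), BD-rate can be computed by fitting each rate-distortion curve with a cubic polynomial in the log-rate domain and averaging the gap over the overlapping quality range:

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard Delta-rate (%): average bitrate difference of the test
    codec over the anchor at equal quality, integrated over the common
    PSNR interval. Cubic polynomial fit in the log-rate domain."""
    lr_a, lr_t = np.log10(rate_anchor), np.log10(rate_test)
    # Fit log-rate as a cubic polynomial of PSNR for each curve.
    p_a = np.polyfit(psnr_anchor, lr_a, 3)
    p_t = np.polyfit(psnr_test, lr_t, 3)
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    # Integrate both fits over the overlapping PSNR range.
    int_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
    int_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)
    avg_diff = (int_t - int_a) / (hi - lo)
    return (10 ** avg_diff - 1) * 100.0
```

A negative value (e.g. -38.1) means the test codec needs that much less bitrate than the anchor at equal quality.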

Similar Papers
  • Research Article
  • Cited by 7
  • 10.1109/ojsp.2022.3160392
Patch Re-Segmentation and Packing for Dynamic Point Cloud Compression via Back-and-Forth Structure
  • Jan 1, 2022
  • IEEE Open Journal of Signal Processing
  • Haoyu Shi + 1 more

The dynamic point cloud is widely needed in 3D-vision-related applications such as virtual reality and telepresence. Due to the huge amount of data, dynamic point cloud compression is a key technology for effective application. The state-of-the-art dynamic point cloud compression scheme, video-based point cloud compression (V-PCC), generates 2D videos in which the patch segmentation and packing process introduces uncorrelated regions, which affects compression efficiency. In this paper, we propose a Packing with Patch Correlation Improvement (PPCI) algorithm to adaptively remove the uncorrelated parts between matched patches during packing and thereby improve inter-prediction performance. We first propose a basic unidirectional patch re-segmentation operator to remove the parts of patches in the current point cloud that are uncorrelated with the patches in its reference point cloud. The removed parts are formed into new patches and added to the patch collection of the current point cloud. We then propose a back-and-forth structure, a combination of several basic patch re-segmentation operators, to bilaterally remove the uncorrelated parts of matched patches within a back-and-forth (BF) unit. Furthermore, we propose a framework to adaptively decide the best length of each BF unit in a point cloud sequence. Experimental results show that our method achieves noticeable bitrate savings compared with existing V-PCC packing methods, particularly for sequences with small motion.

  • Conference Article
  • Cited by 168
  • 10.1109/cvpr46437.2021.00598
VoxelContext-Net: An Octree based Framework for Point Cloud Compression
  • Jun 1, 2021
  • Zizheng Que + 2 more

In this paper, we propose a two-stage deep learning framework called VoxelContext-Net for both static and dynamic point cloud compression. Taking advantage of both octree-based methods and voxel-based schemes, our approach employs the voxel context to compress the octree-structured data. Specifically, we first extract the local voxel representation that encodes the spatial neighbouring context information for each node in the constructed octree. Then, in the entropy coding stage, we propose a voxel-context-based deep entropy model to compress the symbols of non-leaf nodes in a lossless way. Furthermore, for dynamic point cloud compression, we additionally introduce the local voxel representations from the temporally neighbouring point clouds to exploit temporal dependency. More importantly, to alleviate the distortion from the octree construction procedure, we propose a voxel-context-based 3D coordinate refinement method to produce more accurate reconstructed point clouds at the decoder side, which is applicable to both static and dynamic point cloud compression. Comprehensive experiments on both static and dynamic point cloud benchmark datasets (e.g., ScanNet and SemanticKITTI) clearly demonstrate the effectiveness of our newly proposed VoxelContext-Net for 3D point cloud geometry compression.
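Octree-based coders like the one above serialize the point cloud as a stream of 8-bit occupancy codes, one per node, which the entropy model then compresses. A minimal, codec-agnostic sketch of that serialization step:

```python
import numpy as np

def octree_codes(points, depth):
    """Serialize voxelized points (integer coords in [0, 2**depth)) into
    one 8-bit occupancy code per octree node, breadth-first, as octree
    coders do before entropy coding. Minimal sketch, not any specific codec."""
    pts = np.unique(np.asarray(points, dtype=np.int64), axis=0)
    codes = []
    level = {(0, 0, 0): pts}            # node origin -> points inside it
    for d in range(depth):
        half = 1 << (depth - 1 - d)     # child cell size at this level
        nxt = {}
        for origin in sorted(level):
            node_pts = level[origin]
            # Which of the 8 children does each point fall into?
            child = ((node_pts[:, 0] >= origin[0] + half).astype(int) * 4
                     + (node_pts[:, 1] >= origin[1] + half).astype(int) * 2
                     + (node_pts[:, 2] >= origin[2] + half).astype(int))
            code = 0
            for c in np.unique(child):
                code |= 1 << int(c)     # set the bit for each occupied child
                off = (origin[0] + half * ((c >> 2) & 1),
                       origin[1] + half * ((c >> 1) & 1),
                       origin[2] + half * (c & 1))
                nxt[off] = node_pts[child == c]
            codes.append(code)
        level = nxt
    return codes
```

VoxelContext-Net's contribution is the context used to predict these symbols, not the serialization itself.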

  • Research Article
  • Cited by 5
  • 10.1016/j.displa.2023.102528
An end-to-end dynamic point cloud geometry compression in latent space
  • Sep 14, 2023
  • Displays
  • Zhaoyi Jiang + 5 more

Dynamic point clouds are widely used for 3D data representation in various applications such as immersive and mixed reality, robotics and autonomous driving. However, their irregularity and large scale make efficient compression and transmission a challenge. Existing methods require high bitrates to encode point clouds since temporal correlation is not well considered. This paper proposes an end-to-end dynamic point cloud compression network that operates in latent space, resulting in more accurate motion estimation and more effective motion compensation. Specifically, a multi-scale motion estimation network is introduced to obtain accurate motion vectors. Motion information computed at a coarser level is upsampled and warped to the finer level based on cost volume analysis for motion compensation. Additionally, a residual compression network is designed to mitigate the effects of noise and inaccurate predictions by encoding latent residuals, resulting in smaller conditional entropy and better results. The proposed method achieves an average 12.09% and 14.76% (D2) BD-Rate gain over state-of-the-art Deep Dynamic Point Cloud Compression (D-DPCC) in experimental results. Compared to V-PCC, our framework showed an average improvement of 81.29% (D1) and 77.57% (D2).

  • Preprint Article
  • 10.52843/cassyni.sw7s3y
Point Cloud Compression, Super-Resolving and Deblocking
  • Nov 7, 2024

Due to the increased popularity of augmented and virtual reality experiences, as well as 3D sensing for autonomous driving, the interest in capturing high-resolution real-world point clouds has grown significantly in recent years. The point cloud is a new class of signal that is non-uniform and sparse, and this presents unique challenges to signal processing, compression and learning problems. In this talk, we present our multi-scale sparse convolutional learning and Graph Fourier Transform (GFT) based framework for large-scale point cloud processing, with applications to geometry and attribute super-resolution and dynamic point cloud compression with latent-space compensation. The architecture is memory efficient and can learn deep networks to handle large-scale point clouds in real-world applications. Initial results demonstrate that this framework achieves new state-of-the-art results in geometry super-resolution, attribute deblocking and super-resolving, and dynamic point cloud sequence compression.

  • Research Article
  • Cited by 17
  • 10.3390/s22031262
A Method Based on Curvature and Hierarchical Strategy for Dynamic Point Cloud Compression in Augmented and Virtual Reality System †
  • Feb 7, 2022
  • Sensors (Basel, Switzerland)
  • Siyang Yu + 4 more

As a kind of information-intensive 3D representation, the point cloud is developing rapidly in immersive applications, which has also sparked new attention to point cloud compression. The most popular dynamic methods ignore the characteristics of point clouds and use an exhaustive neighborhood search, which seriously impacts the encoder's runtime. Therefore, we propose an improved compression method for dynamic point clouds based on curvature estimation and a hierarchical strategy to meet the demands of real-world scenarios. This method includes initial segmentation derived from the similarity between normals, a curvature-based hierarchical refining process for iterating, and image generation and video compression technology based on de-redundancy without performance loss. The curvature-based hierarchical refining module divides the voxel point cloud into high-curvature points and low-curvature points and optimizes the initial clusters hierarchically. The experimental results show that our method achieved improved compression performance and faster runtime than traditional video-based dynamic point cloud compression.
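The per-point curvature driving the hierarchical refinement above is commonly estimated from the eigenvalues of each point's local neighbourhood covariance (the "surface variation" measure); the paper may use a different estimator. A small sketch under that assumption:

```python
import numpy as np

def point_curvature(points, k=8):
    """Surface-variation curvature per point: PCA of the k-nearest
    neighbourhood, curvature = lambda_min / (lambda_1+lambda_2+lambda_3).
    Near 0 on flat regions, larger on edges and corners. Brute-force
    neighbour search for clarity; real coders would use a spatial index."""
    pts = np.asarray(points, float)
    curv = np.empty(len(pts))
    for i, p in enumerate(pts):
        d = np.linalg.norm(pts - p, axis=1)
        nn = pts[np.argsort(d)[:k]]        # k nearest neighbours (incl. p)
        cov = np.cov(nn.T)
        ev = np.linalg.eigvalsh(cov)       # eigenvalues in ascending order
        curv[i] = ev[0] / max(ev.sum(), 1e-12)
    return curv
```

Thresholding this value splits the cloud into the high- and low-curvature sets used for hierarchical cluster refinement.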

  • Research Article
  • Cited by 60
  • 10.1109/tcsvt.2020.3015901
Predictive Generalized Graph Fourier Transform for Attribute Compression of Dynamic Point Clouds
  • Aug 18, 2020
  • IEEE Transactions on Circuits and Systems for Video Technology
  • Yiqun Xu + 7 more

As 3D scanning devices and depth sensors advance, dynamic point clouds have attracted increasing attention as a format for 3D objects in motion, with applications in various fields such as immersive telepresence, navigation for autonomous driving and gaming. Nevertheless, the tremendous amount of data in dynamic point clouds significantly burdens transmission and storage. To this end, we propose a complete compression framework for attributes of 3D dynamic point clouds, focusing on optimal inter-coding. Firstly, we derive the optimal inter-prediction and predictive transform coding assuming the Gaussian Markov Random Field model with respect to a spatio-temporal graph underlying the attributes of dynamic point clouds. The optimal predictive transform proves to be the Generalized Graph Fourier Transform in terms of spatio-temporal decorrelation. Secondly, we propose refined motion estimation via efficient registration prior to inter-prediction, which searches the temporal correspondence between adjacent frames of irregular point clouds. Finally, we present a complete framework based on the optimal inter-coding and our previously proposed intra-coding, where we determine the optimal coding mode from rate-distortion optimization with the proposed offline-trained λ-Q model. Experimental results show that we achieve around 17% bit rate reduction on average over competitive dynamic point cloud compression methods.
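The Graph Fourier Transform at the heart of the approach above uses the eigenvectors of a graph Laplacian built over the point set as the transform basis. A toy sketch of the intra (single-frame) transform, with an assumed Gaussian edge weighting (the paper's generalized spatio-temporal construction is more involved):

```python
import numpy as np

def gft(coords, attrs, sigma=1.0, eps=1e-3):
    """Graph Fourier Transform of point attributes: build a Gaussian-
    weighted adjacency from pairwise distances, take the eigenvectors of
    the graph Laplacian L = D - W as the basis, and project the attributes
    onto it. Toy sketch of the intra transform only."""
    c = np.asarray(coords, dtype=float)
    d2 = ((c[:, None, :] - c[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma ** 2))     # Gaussian edge weights
    w[w < eps] = 0.0                       # sparsify weak edges
    np.fill_diagonal(w, 0.0)
    lap = np.diag(w.sum(1)) - w
    evals, evecs = np.linalg.eigh(lap)     # eigenvalues act as graph frequencies
    return evecs.T @ np.asarray(attrs, dtype=float), evecs

# Inverse transform: attrs = evecs @ coefficients (orthonormal basis).
```

Smooth attribute signals concentrate energy in the low-frequency (small-eigenvalue) coefficients, which is what makes the transform useful for coding.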

  • Research Article
  • Cited by 10
  • 10.3929/ethz-a-006731956
Dynamic Point Cloud Compression for Free Viewpoint Video
  • Jan 1, 2003
  • Repository for Publications and Research Data (ETH Zurich)
  • Edouard Lamboray + 4 more

In this paper, we present a coding framework addressing the compression of dynamic 3D point clouds which represent real-world objects and which result from a video acquisition using multiple cameras. The encoding is performed as an off-line process and is not time-critical. The decoding, however, must allow for real-time rendering of the dynamic 3D point cloud. We introduce a compression framework which encodes multiple attributes like depth and color of 3D video fragments into progressive streams. The reference data structure is aligned on the original camera input images and thus allows for easy view-dependent decoding. The separate encoding of the object's silhouette allows the use of shape-adaptive compression algorithms. A novel differential coding approach permits random access in constant time throughout the complete data set and thus enables true free viewpoint video.

  • Conference Article
  • Cited by 33
  • 10.24963/ijcai.2022/126
D-DPCC: Deep Dynamic Point Cloud Compression via 3D Motion Prediction
  • Jul 1, 2022
  • Tingyu Fan + 4 more

The non-uniformly distributed nature of the 3D Dynamic Point Cloud (DPC) brings significant challenges to its highly efficient inter-frame compression. This paper proposes a novel 3D sparse convolution-based Deep Dynamic Point Cloud Compression (D-DPCC) network to compensate and compress the DPC geometry with 3D motion estimation and motion compensation in the feature space. In the proposed D-DPCC network, we design a Multi-scale Motion Fusion (MMF) module to accurately estimate the 3D optical flow between the feature representations of adjacent point cloud frames. Specifically, we utilize a 3D sparse convolution-based encoder to obtain the latent representation for motion estimation in the feature space and introduce the proposed MMF module for fused 3D motion embedding. Besides, for motion compensation, we propose a 3D Adaptively Weighted Interpolation (3DAWI) algorithm with a penalty coefficient to adaptively decrease the impact of distant neighbours. We compress the motion embedding and the residual with a lossy autoencoder-based network. To our knowledge, this paper is the first to propose an end-to-end deep dynamic point cloud compression framework. Experimental results show that the proposed D-DPCC framework achieves an average 76% BD-Rate (Bjontegaard Delta Rate) gain against the state-of-the-art Video-based Point Cloud Compression (V-PCC) v13 in inter mode.
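The 3DAWI motion-compensation step described above can be pictured as inverse-distance-weighted feature gathering whose normalizer is inflated by a penalty term, so target points with only distant reference neighbours receive attenuated features. A hypothetical numpy sketch of that idea (the paper operates on sparse-convolution feature maps, and its exact formulation may differ):

```python
import numpy as np

def awi_warp(ref_xyz, ref_feat, tgt_xyz, k=3, penalty=2.0):
    """Adaptively weighted interpolation: each warped target point gathers
    features from its k nearest reference points with inverse-distance
    weights; the penalty added to the normalizer shrinks the contribution
    when all neighbours are far away. Illustrative sketch of 3DAWI."""
    ref_xyz = np.asarray(ref_xyz, float)
    tgt_xyz = np.asarray(tgt_xyz, float)
    ref_feat = np.asarray(ref_feat, float)
    out = np.zeros((len(tgt_xyz), ref_feat.shape[1]))
    for i, q in enumerate(tgt_xyz):
        d = np.linalg.norm(ref_xyz - q, axis=1)
        nn = np.argsort(d)[:k]                 # k nearest reference points
        w = 1.0 / (d[nn] + 1e-8)               # inverse-distance weights
        # Normalizing by (sum + penalty) instead of sum makes features of
        # isolated target points decay toward zero.
        out[i] = (w[:, None] * ref_feat[nn]).sum(0) / (w.sum() + penalty)
    return out
```

With penalty=0 this reduces to plain inverse-distance interpolation; the penalty is what makes the weighting "adaptive" to neighbour distance.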

  • Research Article
  • 10.1109/tip.2025.3648141
Implicit Neural Compression of Point Clouds.
  • Jan 1, 2026
  • IEEE Transactions on Image Processing
  • Hongning Ruan + 5 more

Point clouds have gained prominence across numerous applications due to their ability to accurately represent 3D objects and scenes. However, efficiently compressing unstructured, high-precision point cloud data remains a significant challenge. In this paper, we propose NeRC³, a novel point cloud compression framework that leverages implicit neural representations (INRs) to encode both geometry and attributes of dense point clouds. Our approach employs two coordinate-based neural networks: one maps spatial coordinates to voxel occupancy, while the other maps occupied voxels to their attributes, thereby implicitly representing the geometry and attributes of a voxelized point cloud. The encoder quantizes and compresses network parameters alongside auxiliary information required for reconstruction, while the decoder reconstructs the original point cloud by inputting voxel coordinates into the neural networks. Furthermore, we extend our method to dynamic point cloud compression through techniques that reduce temporal redundancy, including a 4D spatio-temporal representation termed 4D-NeRC³. Experimental results validate the effectiveness of our approach: for static point clouds, NeRC³ outperforms the octree-based G-PCC standard and existing INR-based methods. For dynamic point clouds, 4D-NeRC³ achieves superior geometry compression performance compared to the latest G-PCC and V-PCC standards, while matching state-of-the-art learning-based methods. It also demonstrates competitive performance in joint geometry and attribute compression.

  • Conference Article
  • Cited by 15
  • 10.1109/icip42928.2021.9506333
Dynamic Point Cloud Geometry Compression using Cuboid based Commonality Modeling Framework
  • Sep 19, 2021
  • Ashek Ahmmed + 3 more

Point clouds in their uncompressed format require a very high data rate for storage and transmission. The video-based point cloud compression (V-PCC) technique projects a dynamic point cloud into geometry and texture video sequences. The projected geometry and texture video frames are then encoded using a modern video coding standard such as HEVC. However, the HEVC encoder cannot fully exploit the global commonality that exists within a geometry frame and between successive geometry frames. This is because in HEVC, the frame partitioning starts from a rigid 64 × 64 pixel level without considering the structure of the scene to be coded. In this paper, an improved commonality modeling framework is proposed, leveraging cuboid-based frame partitioning, to encode point cloud geometry frames. The associated frame-partitioning scheme is based on statistical properties of the current geometry frame and therefore yields a flexible block partitioning structure composed of cuboids. Additionally, the proposed commonality modeling approach is computationally efficient and has a compact representation. Experimental results show that if the V-PCC reference encoder is augmented by the proposed commonality modeling technique, bit rate savings of 2.71% and 4.25% are achieved for full-body and upper-body human point cloud geometry sequences, respectively.

  • Research Article
  • Cited by 23
  • 10.3390/s24103142
Advanced Patch-Based Affine Motion Estimation for Dynamic Point Cloud Geometry Compression
  • May 15, 2024
  • Sensors (Basel, Switzerland)
  • Yiting Shao + 3 more

The substantial data volume within dynamic point clouds representing three-dimensional moving entities necessitates advancements in compression techniques. Motion estimation (ME) is crucial for reducing point cloud temporal redundancy. Standard block-based ME schemes, which typically utilize the previously decoded point clouds as inter-reference frames, often yield inaccurate and translation-only estimates for dynamic point clouds. To overcome this limitation, we propose an advanced patch-based affine ME scheme for dynamic point cloud geometry compression. Our approach employs a joint forward-backward ME strategy, generating affine motion-compensated frames for improved inter-geometry references. Before the forward ME process, point cloud motion analysis is conducted on previous frames to perceive motion characteristics. Then, a point cloud is segmented into deformable patches based on geometry correlation and motion coherence. During the forward ME process, affine motion models are introduced to depict the deformable patch motions from the reference to the current frame. Later, affine motion-compensated frames are exploited in the backward ME process to obtain refined motions for better coding performance. Experimental results demonstrate the superiority of our proposed scheme, achieving an average 6.28% geometry bitrate gain over the inter codec anchor. Additional results also validate the effectiveness of key modules within the proposed ME scheme.
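The per-patch affine motion model above generalizes translation-only ME: a patch's motion between frames is described by a 3×3 matrix plus a translation vector. As an illustrative sketch (assuming known point correspondences; the paper's patch segmentation and motion refinement are omitted), such a model can be fit by least squares:

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine motion model dst ~ src @ A.T + t, as used to
    describe a deformable patch's motion between frames (sketch only)."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    # Homogeneous coordinates: solve [src | 1] @ M = dst for M (4x3).
    ones = np.ones((len(src), 1))
    m, *_ = np.linalg.lstsq(np.hstack([src, ones]), dst, rcond=None)
    A, t = m[:3].T, m[3]
    return A, t

def warp(points, A, t):
    """Apply the fitted affine motion to a set of points."""
    return np.asarray(points, float) @ A.T + t
```

The warped patches assembled from such models form the motion-compensated inter-reference frame.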

  • Conference Article
  • Cited by 9
  • 10.1109/iscas51556.2021.9401619
Visual Quality Optimization for View-Dependent Point Cloud Compression
  • May 1, 2021
  • Danying Wang + 4 more

The video-based point cloud compression (V-PCC) is the state-of-the-art dynamic point cloud compression technique. V-PCC projects the 3D point cloud data patch by patch to its bounding box and organizes projected patches into a video frame, making full use of well-developed video coding tools. Despite its high efficiency, cracks easily appear in the reconstructed point cloud at various viewing angles, which seriously degrades the visual quality. In this paper, we propose an efficient method to improve the visual quality of dynamic point clouds, especially for the main view from the content provider. The relationship between patches and views is exploited, and an algorithm intelligently reserving points that may be discarded in V-PCC is proposed. According to our subjective and perceptual objective evaluation experiments, compared with V-PCC, the overall visual quality of the reconstructed point cloud is evidently improved. In particular, cracks are mended with our proposed method. A Bjontegaard delta bit-rate reduction of up to 3.1% is achieved with respect to the Point Cloud Quality Metric (PCQM), which partially verifies the improvement in subjective quality when adopting the proposed method.

  • Research Article
  • Cited by 54
  • 10.1017/atsip.2018.15
Dynamic polygon clouds: representation and compression for VR/AR
  • Jan 1, 2018
  • APSIPA Transactions on Signal and Information Processing
  • Eduardo Pavez + 3 more

We introduce the polygon cloud, a compressible representation of three-dimensional geometry (including attributes, such as color), intermediate between polygonal meshes and point clouds. Dynamic polygon clouds, like dynamic polygonal meshes and dynamic point clouds, can take advantage of temporal redundancy for compression. In this paper, we propose methods for compressing both static and dynamic polygon clouds, specifically triangle clouds. We compare triangle clouds to both triangle meshes and point clouds in terms of compression, for live captured dynamic colored geometry. We find that triangle clouds can be compressed nearly as well as triangle meshes, while being more robust to noise and other structures typically found in live captures, which violate the assumption of a smooth surface manifold, such as lines, points, and ragged boundaries. We also find that triangle clouds can be used to compress point clouds with significantly better performance than previously demonstrated point cloud compression methods. For intra-frame coding of geometry, our method improves upon octree-based intra-frame coding by a factor of 5–10 in bit rate. Inter-frame coding improves this by another factor of 2–5. Overall, our proposed method improves over the previous state-of-the-art in dynamic point cloud compression by 33% or more.

  • Conference Article
  • Cited by 7
  • 10.1109/euvip.2018.8611760
Geometry-Guided 3D Data Interpolation for Projection-Based Dynamic Point Cloud Coding
  • Nov 1, 2018
  • Vida Fakour Sevom + 2 more

With the recent improvements in acquisition techniques for 3D media applications, it has become easier to collect 3D data, for example, dynamic point cloud data. Such point clouds consist of a large amount of 3D coordinates, which describe a scene or object in 3D space by its geometry and texture attributes. Moreover, they are an effective representation of 3D environments for applications such as Augmented Reality or Virtual Reality. One of the main problems for such data is that the number of points is typically too large to allow for real-time transmission or efficient storage. Thus, compressing such 3D data is a key issue to reduce the amount of required bandwidth or memory. This paper presents a method for efficient compression of dynamic point cloud data within the current MPEG standardization framework for dynamic point cloud compression. The key benefit of the presented work is the reduced number of encoded and decoded 3D points compared to the reference framework, thus encoding and decoding complexity is reduced significantly. Objective results show a speed-up of around 35-40% in coding times. Furthermore, reconstruction quality is preserved, thus reducing bit rate requirements by up to 30%. Visual results verify the improved reconstruction quality, and compared to the reference at the same computational complexity, coding efficiency is improved by over 40%.

  • Conference Article
  • Cited by 12
  • 10.1109/icassp39728.2021.9414171
Dynamic Point Cloud Compression Using A Cuboid Oriented Discrete Cosine Based Motion Model
  • Jun 6, 2021
  • Ashek Ahmmed + 3 more

Immersive media representation formats based on point clouds have underpinned significant opportunities for extended reality applications. Point clouds in their uncompressed format require a very high data rate for storage and transmission. The video-based point cloud compression technique projects a dynamic point cloud into geometry and texture video sequences. The projected texture video is then coded using a modern video coding standard such as HEVC. Since the properties of projected texture video frames are different from traditional video frames, HEVC-based commonality modeling can be inefficient. An improved commonality modeling technique is proposed that employs discrete cosine basis oriented motion models, where the domains of such models are approximated by homogeneous regions called cuboids. Experimental results show that the proposed commonality modeling technique can yield bit rate savings of up to 4.17%.
