Double-Deep Learning-Based Point Cloud Geometry Coding with Adaptive Super-Resolution

Abstract

Point clouds represent 3D visual data in a very immersive and realistic way, offering users a large degree of navigation and interaction. For some key use cases, point clouds are potentially lighter and easier to acquire than other 3D representation models, thus offering an alternative with lower computational cost. To offer visually realistic and immersive experiences, notably the illusion of well-formed surfaces, point clouds typically require a large number of points. To make their storage and transmission feasible, efficient point cloud coding is essential. Recently, deep learning-based point cloud coding solutions have proven to be competitive in compression performance, excelling in distinct scenarios, although struggling to achieve similar results for sparser point clouds and lower coding rates. To tackle these limitations, this paper proposes a double-deep learning-based approach for point cloud coding by integrating a super-resolution tool. The main idea consists in converting sparser point clouds into denser ones via a down-sampling step performed before coding. Since this is a lossy process, a super-resolution step is included after decoding to mitigate the point losses and bring the point cloud back to its initial resolution. Furthermore, the sampling factor can be adaptively selected, thus offering additional flexibility to adapt to the point cloud characteristics. The proposed double-deep coding and super-resolution solution outperforms both the G-PCC Octree and V-PCC Intra point cloud coding standards, achieving, respectively, 81.9% and 22.3% rate reduction measured as BD-Rate for the PSNR D1 metric.
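To make the down-sampling idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes integer voxel coordinates and an integer sampling factor, and the function names `downsample` and `naive_upsample` are hypothetical. The learned super-resolution step is replaced here by a plain coordinate rescaling, which restores the original grid but does not recover the merged points.

```python
def downsample(points, factor):
    """Divide integer voxel coordinates by `factor` and merge duplicates.

    Points that were `factor` voxels apart become adjacent, so a sparse
    cloud looks denser on the coarser grid (assumed integer factor).
    """
    return sorted({(x // factor, y // factor, z // factor) for x, y, z in points})


def naive_upsample(points, factor):
    """Rescale coordinates back to the original grid.

    This is only a stand-in for super-resolution: it creates no new points,
    so the points lost when duplicates were merged stay lost.
    """
    return [(x * factor, y * factor, z * factor) for x, y, z in points]


# A sparse row of points with gaps between neighbours.
sparse = [(0, 0, 0), (2, 0, 0), (4, 0, 0), (5, 1, 0), (6, 0, 0)]
coarse = downsample(sparse, 2)          # contiguous on the coarse grid
restored = naive_upsample(coarse, 2)    # original scale, one point fewer
print(coarse)
print(len(sparse), len(restored))
```

In this toy input, two of the five points fall into the same coarse voxel, which illustrates why the down-sampling is lossy and why a post-decoding step is needed to regenerate points.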

Similar Papers
  • Research Article
  • Cite Count Icon 14
  • 10.1109/tmm.2023.3338081
Deep Learning-Based Point Cloud Coding and Super-Resolution: A Joint Geometry and Color Approach
  • Jan 1, 2025
  • IEEE Transactions on Multimedia
  • André F R Guarda + 5 more

In this golden age of multimedia, realistic content is in high demand with users seeking more immersive and interactive experiences. As a result, new image modalities for 3D representations have emerged in recent years, among which point clouds have received special attention. Naturally, with this increase in demand, efficient storage and transmission became a must, with standardization groups such as MPEG and JPEG entering the scene, as happened before with other types of visual media. In a surprising development, JPEG issued a Call for Proposals on point cloud coding targeting exclusively learning-based solutions, in parallel to a similar call for image coding. This is a natural consequence of the growing popularity of deep learning, which, due to its excellent performance, is currently dominant in the multimedia processing field, including coding. This paper presents the coding solution selected by JPEG as the best-performing response to the Call for Proposals and adopted as the first version of the JPEG Pleno Point Cloud Coding Verification Model, in practice the first step for developing a standard. The proposed solution offers a novel joint geometry and color approach for point cloud coding, in which a single deep learning model processes both geometry and color simultaneously. To maximize the RD performance for a large range of point clouds, the proposed solution uses down-sampling and learning-based super-resolution as pre- and post-processing steps. Compared to the MPEG point cloud coding standards, the proposed coding solution comfortably outperforms G-PCC for geometry, color, and joint quality metrics.

  • Conference Article
  • Cite Count Icon 5
  • 10.1109/ism55400.2022.00016
Impact of Conventional and Deep Learning-based Point Cloud Geometry Coding on Deep Learning-based Classification Performance
  • Dec 1, 2022
  • Abdelrahman Seleem + 3 more

Deep learning (DL)-based point cloud (PC) classification is a key computer vision task for many applications, notably autonomous driving, surveillance, and cultural heritage. In many application scenarios, PCs must be coded to reach practical rates for storage and transmission purposes, and thus they suffer from more or less intense compression artifacts. After the specification of two MPEG PC coding standards, DL-based PC coding has gained momentum, reaching competitive compression performance, especially for dense PCs. Since using decoded PCs, which may suffer from compression artifacts, may impact the final classification performance, the main goal of this paper is to study the impact of static PC geometry coding on DL-based classification. This study is performed on the ModelNet40 test dataset using the conventional G-PCC coding standard and the DL-based PC geometry codec which was the top-performing solution responding to the recent JPEG Pleno PC Coding Call for Proposals. Two high-performing DL-based classifiers are used, considering the original PC geometry before and after voxelization, as well as the decoded PC geometry for different rates and qualities. As expected, coding has an impact on the classification performance, especially for the lower rates/qualities. For very sparse PCs, conventional coding still has an advantage, contrary to dense PCs, but this should change in the future with DL-based tools becoming the most natural solutions for both PC geometry coding and classification.

  • Conference Article
  • Cite Count Icon 21
  • 10.1109/mmsp48831.2020.9287060
Deep Learning-based Point Cloud Geometry Coding with Resolution Scalability
  • Sep 21, 2020
  • Andre F R Guarda + 2 more

Point clouds are a 3D visual representation format that has recently become fundamentally important for immersive and interactive multimedia applications. Considering the high number of points of practically relevant point clouds, and their increasing market demand, efficient point cloud coding has become a vital research topic. In addition, scalability is an important feature for point cloud coding, especially for real-time applications, where the fast and rate efficient access to a decoded point cloud is important; however, this issue is still rather unexplored in the literature. In this context, this paper proposes a novel deep learning-based point cloud geometry coding solution with resolution scalability via interlaced sub-sampling. As additional layers are decoded, the number of points in the reconstructed point cloud increases as well as the overall quality. Experimental results show that the proposed scalable point cloud geometry coding solution outperforms the recent MPEG Geometry-based Point Cloud Compression standard which is much less scalable.
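As a rough illustration of resolution scalability through sub-sampling, the sketch below splits a point cloud into disjoint layers and reconstructs it from a growing number of them. The interlacing rule used here (coordinate sum modulo the number of layers) is an assumption chosen for simplicity, not necessarily the pattern used in the paper, and `split_layers` and `reconstruct` are hypothetical names.

```python
def split_layers(points, num_layers):
    """Assign each point to one layer using an assumed interlacing rule."""
    layers = [[] for _ in range(num_layers)]
    for p in points:
        layers[sum(p) % num_layers].append(p)
    return layers


def reconstruct(layers, decoded):
    """Merge the first `decoded` layers into one point cloud."""
    out = []
    for layer in layers[:decoded]:
        out.extend(layer)
    return sorted(out)


# A flat 4x4 patch of points, split into two interlaced layers.
points = [(x, y, 0) for x in range(4) for y in range(4)]
layers = split_layers(points, 2)
base = reconstruct(layers, 1)   # base layer only: half the points
full = reconstruct(layers, 2)   # both layers: the complete patch
print(len(base), len(full))
```

Decoding only the first layer yields a checkerboard subset of the patch, and each additional layer adds points, which mirrors the behaviour described above where the point count grows as more layers are decoded.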

  • Conference Article
  • Cite Count Icon 17
  • 10.1109/euvip47703.2019.8946211
Deep Learning-Based Point Cloud Coding: A Behavior and Performance Study
  • Oct 1, 2019
  • Andre F R Guarda + 2 more

Point clouds are an emerging 3D visual representation model for immersive and interactive multimedia applications, in particular for virtual and augmented reality. The huge amount of data associated with point clouds critically calls for efficient point cloud coding technology. While there are already some point cloud coding paradigms in the literature, notably octree, patch and graph-based for geometry data, very recently deep learning emerged in this research domain, offering very promising performance for image coding. While deep learning-based methods often provide interesting results, understanding this type of coding solution is essential to improve its design so that it can be used effectively. In this context, this paper presents a study and analysis of the behavior and performance of a deep learning-based point cloud coding solution based on an autoencoder network using only convolutional layers. Besides a promising RD performance, other findings should allow making solid steps in understanding this emerging coding paradigm.

  • Research Article
  • Cite Count Icon 5
  • 10.1109/access.2025.3549316
The JPEG Pleno Learning-Based Point Cloud Coding Standard: Serving Man and Machine
  • Jan 1, 2025
  • IEEE Access
  • André F R Guarda + 2 more

Efficient point cloud coding has become increasingly critical for multiple applications such as virtual reality, autonomous driving, and digital twin systems, where rich and interactive 3D data representations may functionally make the difference. Deep learning has emerged as a powerful tool in this domain, offering advanced techniques for compressing point clouds more efficiently than conventional coding methods while also allowing computer vision tasks to be performed effectively in the compressed domain, thus, for the first time, making available a common compressed visual representation effective for both man and machine. Taking advantage of this potential, JPEG has recently finalized the JPEG Pleno Learning-based Point Cloud Coding (PCC) standard offering efficient lossy coding of static point clouds, targeting both human visualization and machine processing by leveraging deep learning models for geometry and color coding. The geometry is processed directly in its original 3D form using sparse convolutional neural networks, while the color data is projected onto 2D images and encoded using the also learning-based JPEG AI standard. The goal of this paper is to provide a complete technical description of the JPEG PCC standard, along with a thorough benchmarking of its performance against the state-of-the-art, while highlighting its main strengths and weaknesses. In terms of compression performance, JPEG PCC outperforms the conventional MPEG PCC standards, especially in geometry coding, achieving significant rate reductions. Color compression performance is less competitive, but this is offset by the power of a full learning-based coding framework for both geometry and color and the associated effective compressed domain processing.

  • Research Article
  • Cite Count Icon 13
  • 10.1109/mmul.2020.3046691
Neighborhood Adaptive Loss Function for Deep Learning-Based Point Cloud Coding With Implicit and Explicit Quantization
  • Dec 22, 2020
  • IEEE MultiMedia
  • Andre F R Guarda + 2 more

As the interest in deep learning tools continues to rise, new multimedia research fields begin to discover its potential. Both image and point cloud coding are good examples of technologies, where deep learning-based solutions have recently displayed very competitive performance. In this context, this article brings two novel contributions to the point cloud geometry coding state-of-the-art; first, a novel neighborhood adaptive distortion metric to be used in the training loss function, which allows significantly improving the rate-distortion performance with commonly used objective quality metrics; second, an explicit quantization approach at the training and coding times to generate varying rate/quality with a single trained deep learning coding model, effectively reducing the training complexity and storage requirements. The result is an improved deep learning-based point cloud geometry coding solution, which is both more compression efficient and less demanding in training complexity and storage.

  • Conference Article
  • Cite Count Icon 64
  • 10.1109/pcs48520.2019.8954537
Point Cloud Coding: Adopting a Deep Learning-based Approach
  • Nov 1, 2019
  • André F R Guarda + 2 more

Point clouds have recently become an important visual representation format, especially for virtual and augmented reality applications, thus making point cloud coding a very hot research topic. Deep learning-based coding methods have recently emerged in the field of image coding with increasing success. These coding solutions take advantage of the ability of convolutional neural networks to extract adaptive features from the images to create a latent representation that can be efficiently coded. In this context, this paper extends the deep-learning coding approach to point cloud coding using an autoencoder network design. Performance results are very promising, showing improvements over the Point Cloud Library codec often taken as benchmark, thus suggesting a significant margin of evolution for this new point cloud coding paradigm.

  • Research Article
  • Cite Count Icon 105
  • 10.1109/jstsp.2020.3047520
Adaptive Deep Learning-Based Point Cloud Geometry Coding
  • Dec 25, 2020
  • IEEE Journal of Selected Topics in Signal Processing
  • Andre F R Guarda + 2 more

Point clouds are a very rich 3D visual representation model, which has become increasingly appealing for multimedia applications with immersion, interaction and realism requirements. Due to different acquisition and creation conditions as well as target applications, point clouds' characteristics may be very diverse, notably in their density. While geographical information systems or autonomous driving applications may use rather sparse point clouds, cultural heritage or virtual reality applications typically use denser point clouds to more accurately represent objects and people. Naturally, to offer immersion and realism, point clouds need a rather large number of points, thus calling for the development of efficient coding solutions. The use of deep learning models for coding purposes has recently gained relevance, with the latest developments in image coding achieving state-of-the-art performance, thus making the adoption of this technology natural also for point cloud coding. This paper presents a novel deep learning-based solution for point cloud geometry coding which is able to efficiently adapt to the content's characteristics. The proposed coding solution divides the point cloud into 3D blocks and selects the most suitable available deep learning coding model to code each block, thus maximizing the compression performance. In comparison to the state-of-the-art MPEG G-PCC Trisoup standard, the proposed coding solution offers average quality gains up to 4.9 and 5.7 dB for PSNR D1 and PSNR D2, respectively.
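The block-wise model selection described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the selection criterion is taken to be a Lagrangian rate-distortion cost J = D + λR, which the abstract does not specify, and the model names and their (rate, distortion) figures are made up for the example. `partition_into_blocks` and `select_model` are hypothetical helpers.

```python
from collections import defaultdict


def partition_into_blocks(points, block_size):
    """Group integer voxel coordinates into cubic blocks of side `block_size`."""
    blocks = defaultdict(list)
    for x, y, z in points:
        key = (x // block_size, y // block_size, z // block_size)
        blocks[key].append((x, y, z))
    return dict(blocks)


def select_model(candidates, lam):
    """Pick the candidate with the lowest assumed cost J = D + lam * R.

    `candidates` maps a model name to a (rate_bits, distortion) pair that
    would come from actually coding the block with that model.
    """
    return min(candidates, key=lambda name: candidates[name][1] + lam * candidates[name][0])


points = [(0, 0, 0), (63, 63, 63), (64, 0, 0)]
blocks = partition_into_blocks(points, 64)

# Hypothetical per-block results for two trained models.
candidates = {"model_a": (100, 2.0), "model_b": (60, 3.0)}
low_lambda_choice = select_model(candidates, 0.01)   # quality matters more
high_lambda_choice = select_model(candidates, 0.05)  # rate matters more
print(sorted(blocks), low_lambda_choice, high_lambda_choice)
```

With a small λ the lower-distortion model wins, while a larger λ favours the cheaper one, so the same block can be assigned different models depending on the target rate.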

  • Conference Article
  • Cite Count Icon 11
  • 10.1109/mmsp.2019.8901690
Improved Patch Packing for the MPEG V-PCC Standard
  • Sep 1, 2019
  • Afonso Costa + 4 more

Point cloud representation is an emerging visual data technology, targeting immersive 3D experiences in the context of multiple application scenarios, notably entertainment, geographical information systems, medicine, architecture, and robotics. Since a point cloud may easily involve millions of points, and thus an enormous amount of data, its effective storage and transmission critically requires efficient coding solutions. With this purpose in mind, several point cloud coding (PCC) solutions have been proposed in the literature; special emphasis is due to the recent MPEG standards, which target interoperability in this domain, notably the MPEG V-PCC standard. The objective of this paper is to improve the V-PCC standard compression efficiency by proposing novel solutions for the V-PCC packing module without compromising in any way the V-PCC stream (syntax and semantics) and decoder compliance. In this context, several patch packing solutions are proposed, including new packing algorithms and associated sorting and positioning metrics; for the metrics, both absolute and relative approaches are proposed. The RD performance results show BD-Rate savings up to 0.8% for the best packing solution regarding the V-PCC benchmark. Moreover, the packing map size reductions can go up to 12%, on average.

  • Research Article
  • Cite Count Icon 21
  • 10.1016/j.image.2020.115862
Point cloud coding: A privileged view driven by a classification taxonomy
  • Apr 21, 2020
  • Signal Processing: Image Communication
  • Fernando Pereira + 3 more


  • Conference Article
  • Cite Count Icon 8
  • 10.1109/mmsp.2017.8122287
Improving point cloud to surface reconstruction with generalized Tikhonov regularization
  • Oct 1, 2017
  • Andre F R Guarda + 3 more

Point cloud rendering has a vital role in the user Quality of Experience for applications adopting point cloud-based representations. While this is not a new area, it has become more relevant with the recent interest in point cloud coding by major standardization groups, notably JPEG and MPEG. The screened Poisson surface reconstruction is a state-of-the-art technique for generating a watertight surface mesh from the point cloud samples. While its screening component allows the surface to better fit the cloud points, this fitting may lead to undesired artifacts in the surface, notably when the point cloud is noisy. This paper proposes to improve this reconstruction method by making it more robust to noise by adopting a generalized Tikhonov regularization term. The proposed regularization approach smooths regions that should be flat while keeping the important details in the edges, thus creating more pleasant surface reconstructions.

  • Conference Article
  • Cite Count Icon 22
  • 10.1109/mmsp.2019.8901791
Adaptive Multi-level Triangle Soup for Geometry-based Point Cloud Coding
  • Sep 1, 2019
  • Antoine Dricot + 1 more

Nowadays, point clouds are considered a promising representation for future immersive 3D applications. However, to recreate a 3D object or scene with high fidelity, a large number of points is required, often with high coordinate precision. Thus, efficient compression schemes are much needed for transmission and storage of point cloud data. Several types of coding techniques (e.g. 2D mapping, graph transforms, etc.) have been proposed in the literature to solve this problem. Octree-based solutions provide compression efficiency and level-of-detail scalability, especially for large static point clouds. The upcoming MPEG Geometry based Point Cloud Coding (G-PCC) solution provides high coding performance and relies on a static pruned octree, combined with a trisoup (for triangle soup) surface reconstruction. In this paper, the geometry coding process of G-PCC is enhanced by introducing mode decisions, thus enabling the trisoup in leaf nodes at multiple levels of the octree, i.e. allowing a more adaptive octree partitioning. Experimental results report average BD-rate gains of 5.3% (up to 7%) for the geometry component (point-to-plane error) on dense point clouds, with no significant impact on the color coding performance.

  • Research Article
  • Cite Count Icon 2
  • 10.1109/access.2025.3551073
A Double Deep Learning-Based Solution for Efficient Event Data Coding and Classification
  • Jan 1, 2025
  • IEEE Access
  • Abdelrahman Seleem + 3 more

Event cameras have the ability to capture asynchronous per-pixel brightness changes, usually called “events”, offering advantages over traditional frame-based cameras for computer vision tasks. Efficiently coding event data is critical for practical transmission and storage, given the very significant number of events captured. This paper proposes a novel double deep learning-based solution for efficient event data coding and classification, using a point cloud-based representation for events. Moreover, since the conversions from events to point clouds and back to events are key steps in the proposed solution, novel tools are proposed and their impact is evaluated in terms of compression and classification performance. Experimental results show that it is possible to achieve a classification performance for decompressed events which is similar to the one for original events, even after applying a lossy point cloud codec, notably the recent deep learning-based JPEG Pleno Point Cloud Coding standard, with a clear rate reduction. Experimental results also demonstrate that events coded using the JPEG standard achieve better classification performance than those coded using the conventional lossy MPEG Geometry-based Point Cloud Coding standard for the same rate. Furthermore, the adoption of deep learning-based coding offers future high potential for performing computer vision tasks in the compressed domain, which allows skipping the decoding stage, thus mitigating the impact of compression artifacts.

  • Conference Article
  • Cite Count Icon 21
  • 10.1109/icip40778.2020.9191021
Point Cloud Geometry Scalable Coding With a Single End-to-End Deep Learning Model
  • Oct 1, 2020
  • Andre F R Guarda + 2 more

Point clouds are gaining importance as the format to represent complex 3D objects and scenes, offering high user immersion and interaction, although at the cost of requiring massive data. Scalable coding is an important feature for point cloud coding, especially for real-time applications, where the fast and bitrate efficient access to a decoded point cloud is important; however, this issue is still rather unexplored in the literature. With the rise of deep learning methods as a promising solution for efficient coding, this paper proposes the first deep learning-based point cloud geometry scalable coding solution. Experimental results show that the proposed scalable coding solution consistently outperforms the MPEG standard for static point cloud geometry coding. In this way, a new research path is open for point cloud scalable coding technology.

  • Research Article
  • Cite Count Icon 1
  • 10.1109/tmm.2025.3565958
Hierarchical Distortion Learning for Fast Lossy Compression of Point Clouds
  • Jan 1, 2025
  • IEEE Transactions on Multimedia
  • Pengpeng Yu + 4 more

The growth of 3D point cloud applications requires efficient compression techniques for high-quality and low-latency services. Recently, learning-based point cloud compression models have made significant progress. However, geometric distortion resulting from downsampling limits the feature depth within large-scale point clouds, thereby constraining the receptive field and suppressing redundancy removal. Moreover, the issues of computational efficiency and reconstruction quality still persist in the compression of large-scale point clouds. To address these challenges, we propose a hierarchical distortion learning framework for end-to-end lossy compression of point clouds. First, we design a feature residual compression module to efficiently transmit shallow semantics between the encoder and the decoder, which enables a lightweight design of our framework. Second, we introduce a geometry residual compression module to progressively complement the reconstruction distortion, avoiding the accumulation of geometric distortion. By integrating these two modules and employing sufficient downsampling processes, we develop a high-performance framework with a significantly enlarged receptive field and low computational cost. Extensive experiments demonstrate that our method achieves state-of-the-art performance in geometry lossy compression, while delivering competitive performance in joint geometry and color lossy compression with fast running speed. Code is available at https://github.com/pengpeng-yu/FastPCC.
