Deep Learning-based Point Cloud Geometry Coding with Resolution Scalability

Abstract

Point clouds are a 3D visual representation format that has recently become fundamentally important for immersive and interactive multimedia applications. Considering the high number of points in practically relevant point clouds, and their increasing market demand, efficient point cloud coding has become a vital research topic. In addition, scalability is an important feature for point cloud coding, especially for real-time applications, where fast and rate-efficient access to a decoded point cloud is important; however, this issue is still rather unexplored in the literature. In this context, this paper proposes a novel deep learning-based point cloud geometry coding solution with resolution scalability via interlaced sub-sampling. As additional layers are decoded, the number of points in the reconstructed point cloud increases, as does the overall quality. Experimental results show that the proposed scalable point cloud geometry coding solution outperforms the recent MPEG Geometry-based Point Cloud Compression standard, which is much less scalable.
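
The abstract does not describe the sampling pattern, so the following is only a minimal sketch of how interlaced sub-sampling could produce resolution-scalable layers. It assumes integer voxel coordinates and a parity-based assignment of points to eight layers; the names `layer_index`, `split_interlaced` and `reconstruct` are hypothetical and are not taken from the paper.

```python
def layer_index(point):
    """Map a voxel to one of 8 interlaced layers from the parity of x, y, z."""
    x, y, z = point
    return (x % 2) * 4 + (y % 2) * 2 + (z % 2)


def split_interlaced(points):
    """Partition integer voxel coordinates into 8 disjoint interlaced layers."""
    layers = [[] for _ in range(8)]
    for p in points:
        layers[layer_index(p)].append(p)
    return layers


def reconstruct(layers, num_decoded):
    """Merge the first `num_decoded` layers into one point set."""
    merged = set()
    for layer in layers[:num_decoded]:
        merged.update(layer)
    return merged


# Toy example: a fully occupied 4x4x4 block of voxels.
points = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
layers = split_interlaced(points)
# Number of reconstructed points after decoding 1, 2, ..., 8 layers.
sizes = [len(reconstruct(layers, k)) for k in range(1, 9)]
print(sizes)
```

In this toy setting every decoded layer adds points to the reconstruction, and decoding all layers recovers the original point set. In an actual codec each layer would be coded separately (here, by a learned model), which this sketch does not attempt to show.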

Similar Papers
  • Conference Article
  • Citations: 17
  • 10.1109/euvip47703.2019.8946211
Deep Learning-Based Point Cloud Coding: A Behavior and Performance Study
  • Oct 1, 2019
  • Andre F R Guarda + 2 more

Point clouds are an emerging 3D visual representation model for immersive and interactive multimedia applications, in particular for virtual and augmented reality. The huge amount of data associated with point clouds critically calls for efficient point cloud coding technology. While there are already some point cloud coding paradigms in the literature, notably octree-, patch- and graph-based for geometry data, very recently deep learning emerged in this research domain, offering very promising performances for image coding. While deep learning-based methods often provide interesting results, understanding this type of coding solution is essential to improve its design so that it can be used effectively. In this context, this paper presents a study and analysis of the behavior and performance of a deep learning-based point cloud coding solution based on an autoencoder network using only convolutional layers. Besides a promising RD performance, other findings should allow making solid steps in understanding this emerging coding paradigm.

  • Conference Article
  • Citations: 7
  • 10.1109/euvip53989.2022.9922784
Double-Deep Learning-Based Point Cloud Geometry Coding with Adaptive Super-Resolution
  • Sep 11, 2022
  • Manuel Ruivo + 2 more

Point clouds represent 3D visual data in a very immersive and realistic way, offering users a large degree of navigation and interaction. For some key use cases, point clouds are potentially lighter and easier to acquire than other 3D representation models, thus offering an alternative with lower computational cost. To offer visually realistic and immersive experiences, notably the illusion of well-formed surfaces, point clouds typically require a large number of points. To make their storage and transmission feasible, efficient point cloud coding is essential. Recently, deep learning-based point cloud coding solutions have proven to be competitive in compression performance, excelling in distinct scenarios, although struggling to achieve similar results for sparser point clouds and lower coding rates. To tackle these limitations, this paper proposes a double-deep learning-based approach for point cloud coding by integrating a super-resolution tool. The main idea consists of converting sparser point clouds into denser ones via a down-sampling step performed before coding. Since this is a lossy process, a super-resolution step is included after decoding to mitigate the point losses and bring the point cloud back to the initial resolution. Furthermore, the sampling factor can be adaptively selected, thus offering additional flexibility to adapt to the point cloud characteristics. The proposed double-deep coding and super-resolution solution outperforms both the G-PCC Octree and V-PCC Intra point cloud coding standards, achieving, respectively, 81.9% and 22.3% rate reduction measured as BD-Rate for the PSNR D1 metric.
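
To illustrate why down-sampling makes a sparse point cloud denser on the coarser grid, and why the step is lossy, here is a minimal sketch. It assumes integer voxel coordinates, floor division by the sampling factor, and merging of duplicate points; `downsample` and `naive_upsample` are hypothetical helpers. The learned super-resolution step of the paper is not reproduced, and plain rescaling stands in for it.

```python
def downsample(points, factor):
    """Scale integer voxel coordinates down by `factor` and merge duplicates."""
    return {(x // factor, y // factor, z // factor) for (x, y, z) in points}


def naive_upsample(points, factor):
    """Scale coordinates back up; points merged by down-sampling stay lost."""
    return {(x * factor, y * factor, z * factor) for (x, y, z) in points}


# Sparse toy cloud: every other voxel along a line, plus one extra
# neighbor at (6, 1, 0).
original = {(0, 0, 0), (2, 0, 0), (4, 0, 0), (6, 0, 0), (6, 1, 0)}
coarse = downsample(original, 2)
restored = naive_upsample(coarse, 2)

print(sorted(coarse))                 # occupied voxels are now adjacent
print(len(original), len(restored))   # one point is lost by the round trip
```

With a factor of 2, the gaps between occupied voxels disappear on the coarse grid, while the point at (6, 1, 0) merges with its neighbor and cannot be recovered by rescaling alone. That loss is what a super-resolution step is meant to mitigate.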

  • Conference Article
  • Citations: 5
  • 10.1109/ism55400.2022.00016
Impact of Conventional and Deep Learning-based Point Cloud Geometry Coding on Deep Learning-based Classification Performance
  • Dec 1, 2022
  • Abdelrahman Seleem + 3 more

Deep learning (DL)-based point cloud (PC) classification is a key computer vision task for many applications, notably autonomous driving, surveillance, and cultural heritage. In many application scenarios, PCs must be coded to reach practical rates for storage and transmission purposes, and thus they suffer from more or less intense compression artifacts. After the specification of two MPEG PC coding standards, DL-based PC coding has gained momentum, reaching competitive compression performance, especially for dense PCs. Since using decoded PCs, which may suffer from compression artifacts, may impact the final classification performance, the main goal of this paper is to study the impact of static PC geometry coding on DL-based classification. This study is performed on the ModelNet40 test dataset using the conventional G-PCC coding standard and the DL-based PC geometry codec which was the top-performing solution responding to the recent JPEG Pleno PC Coding Call for Proposals. Two high-performing DL-based classifiers are used, considering the original PC geometry before and after voxelization, as well as the decoded PC geometry for different rates and qualities. As expected, coding has an impact on the classification performance, especially for the lower rates/qualities. For very sparse PCs, conventional coding still has an advantage, contrary to dense PCs, but this should change in the future with DL-based tools becoming the most natural solutions for both PC geometry coding and classification.

  • Research Article
  • Citations: 14
  • 10.1109/tmm.2023.3338081
Deep Learning-Based Point Cloud Coding and Super-Resolution: A Joint Geometry and Color Approach
  • Jan 1, 2025
  • IEEE Transactions on Multimedia
  • André F R Guarda + 5 more

In this golden age of multimedia, realistic content is in high demand with users seeking more immersive and interactive experiences. As a result, new image modalities for 3D representations have emerged in recent years, among which point clouds have deserved special attention. Naturally, with this increase in demand, efficient storage and transmission became a must, with standardization groups such as MPEG and JPEG entering the scene, as happened before with other types of visual media. In a surprising development, JPEG issued a Call for Proposals on point cloud coding targeting exclusively learning-based solutions, in parallel to a similar call for image coding. This is a natural consequence of the growing popularity of deep learning, which due to its excellent performance is currently dominant in the multimedia processing field, including coding. This paper presents the coding solution selected by JPEG as the best-performing response to the Call for Proposals and adopted as the first version of the JPEG Pleno Point Cloud Coding Verification Model, in practice the first step for developing a standard. The proposed solution offers a novel joint geometry and color approach for point cloud coding, in which a single deep learning model processes both geometry and color simultaneously. To maximize the RD performance for a large range of point clouds, the proposed solution uses down-sampling and learning-based super-resolution as pre- and post-processing steps. Compared to the MPEG point cloud coding standards, the proposed coding solution comfortably outperforms G-PCC for geometry, color, and joint quality metrics.

  • Research Article
  • Citations: 13
  • 10.1109/mmul.2020.3046691
Neighborhood Adaptive Loss Function for Deep Learning-Based Point Cloud Coding With Implicit and Explicit Quantization
  • Dec 22, 2020
  • IEEE MultiMedia
  • Andre F R Guarda + 2 more

As the interest in deep learning tools continues to rise, new multimedia research fields begin to discover its potential. Both image and point cloud coding are good examples of technologies where deep learning-based solutions have recently displayed very competitive performance. In this context, this article brings two novel contributions to the point cloud geometry coding state-of-the-art: first, a novel neighborhood adaptive distortion metric to be used in the training loss function, which significantly improves the rate-distortion performance with commonly used objective quality metrics; second, an explicit quantization approach at training and coding time to generate varying rate/quality with a single trained deep learning coding model, effectively reducing the training complexity and storage requirements. The result is an improved deep learning-based point cloud geometry coding solution, which is both more compression efficient and less demanding in training complexity and storage.
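
The abstract does not specify how the explicit quantization is parameterized. As a hedged illustration of how a single set of latents can yield several rate/quality points, the sketch below applies uniform scalar quantization with different step sizes; the step values, the toy latents, and the helper names `quantize` and `dequantize` are assumptions made for illustration only.

```python
def quantize(latents, step):
    """Uniformly quantize latent values to integer symbols with a given step."""
    return [round(v / step) for v in latents]


def dequantize(symbols, step):
    """Map integer symbols back to reconstructed latent values."""
    return [s * step for s in symbols]


# Toy latent values standing in for the output of a trained encoder.
latents = [0.1, 0.4, 0.9, 1.3, 1.8, 2.2]
for step in (0.5, 1.0, 2.0):
    symbols = quantize(latents, step)
    # Fewer distinct symbols generally means a cheaper entropy-coded stream,
    # at the price of a coarser reconstruction.
    print(step, symbols, len(set(symbols)))
```

A larger step collapses more latent values onto the same symbol, which lowers the rate and the quality together without retraining the model.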

  • Conference Article
  • Citations: 21
  • 10.1109/icip40778.2020.9191021
Point Cloud Geometry Scalable Coding With a Single End-to-End Deep Learning Model
  • Oct 1, 2020
  • Andre F R Guarda + 2 more

Point clouds are gaining importance as the format to represent complex 3D objects and scenes, offering high user immersion and interaction, although at the cost of requiring massive data. Scalable coding is an important feature for point cloud coding, especially for real-time applications, where fast and bitrate-efficient access to a decoded point cloud is important; however, this issue is still rather unexplored in the literature. With the rise of deep learning methods as a promising solution for efficient coding, this paper proposes the first deep learning-based point cloud geometry scalable coding solution. Experimental results show that the proposed scalable coding solution consistently outperforms the MPEG standard for static point cloud geometry coding. In this way, a new research path is opened for point cloud scalable coding technology.

  • Research Article
  • Citations: 105
  • 10.1109/jstsp.2020.3047520
Adaptive Deep Learning-Based Point Cloud Geometry Coding
  • Dec 25, 2020
  • IEEE Journal of Selected Topics in Signal Processing
  • Andre F R Guarda + 2 more

Point clouds are a very rich 3D visual representation model, which has become increasingly appealing for multimedia applications with immersion, interaction and realism requirements. Due to different acquisition and creation conditions as well as target applications, point clouds' characteristics may be very diverse, notably in their density. While geographical information systems or autonomous driving applications may use rather sparse point clouds, cultural heritage or virtual reality applications typically use denser point clouds to more accurately represent objects and people. Naturally, to offer immersion and realism, point clouds need a rather large number of points, thus calling for the development of efficient coding solutions. The use of deep learning models for coding purposes has recently gained relevance, with the latest developments in image coding achieving state-of-the-art performance, thus making the adoption of this technology natural for point cloud coding as well. This paper presents a novel deep learning-based solution for point cloud geometry coding which is able to efficiently adapt to the content's characteristics. The proposed coding solution divides the point cloud into 3D blocks and selects the most suitable available deep learning coding model to code each block, thus maximizing the compression performance. In comparison to the state-of-the-art MPEG G-PCC Trisoup standard, the proposed coding solution offers average quality gains up to 4.9 and 5.7 dB for PSNR D1 and PSNR D2, respectively.
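
The block-wise scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the block size of 64, the helper names, and the two placeholder cost functions are hypothetical, and a real encoder would measure an actual rate-distortion cost for each trained model.

```python
from collections import defaultdict


def partition_into_blocks(points, block_size):
    """Group integer voxel coordinates by the 3D block that contains them."""
    blocks = defaultdict(list)
    for x, y, z in points:
        key = (x // block_size, y // block_size, z // block_size)
        # Store coordinates relative to the block origin.
        blocks[key].append((x % block_size, y % block_size, z % block_size))
    return dict(blocks)


def select_model(block_points, models):
    """Pick the candidate with the lowest cost for this block.

    `models` maps a name to a cost function; here the costs are placeholders
    standing in for a real rate-distortion measurement.
    """
    return min(models, key=lambda name: models[name](block_points))


# Placeholder cost functions (hypothetical): one is cheaper for blocks with
# few points, the other has a fixed cost that wins for fuller blocks.
models = {
    "sparse_model": lambda pts: len(pts),
    "dense_model": lambda pts: 4,
}

points = [(0, 0, 0), (1, 2, 3), (70, 1, 1), (71, 1, 1), (72, 1, 1),
          (73, 1, 1), (74, 1, 1), (75, 1, 1)]
blocks = partition_into_blocks(points, block_size=64)
choice = {key: select_model(pts, models) for key, pts in blocks.items()}
print(choice)
```

The chosen model index would have to be signaled per block so that the decoder can apply the matching model.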

  • Research Article
  • Citations: 5
  • 10.1109/access.2025.3549316
The JPEG Pleno Learning-Based Point Cloud Coding Standard: Serving Man and Machine
  • Jan 1, 2025
  • IEEE Access
  • André F R Guarda + 2 more

Efficient point cloud coding has become increasingly critical for multiple applications such as virtual reality, autonomous driving, and digital twin systems, where rich and interactive 3D data representations may functionally make the difference. Deep learning has emerged as a powerful tool in this domain, offering advanced techniques for compressing point clouds more efficiently than conventional coding methods while also allowing computer vision tasks to be performed effectively in the compressed domain, thus, for the first time, making available a common compressed visual representation effective for both man and machine. Taking advantage of this potential, JPEG has recently finalized the JPEG Pleno Learning-based Point Cloud Coding (PCC) standard offering efficient lossy coding of static point clouds, targeting both human visualization and machine processing by leveraging deep learning models for geometry and color coding. The geometry is processed directly in its original 3D form using sparse convolutional neural networks, while the color data is projected onto 2D images and encoded using the JPEG AI standard, which is also learning-based. The goal of this paper is to provide a complete technical description of the JPEG PCC standard, along with a thorough benchmarking of its performance against the state-of-the-art, while highlighting its main strengths and weaknesses. In terms of compression performance, JPEG PCC outperforms the conventional MPEG PCC standards, especially in geometry coding, achieving significant rate reductions. Color compression performance is less competitive, but this is offset by the power of a full learning-based coding framework for both geometry and color and the associated effective compressed domain processing.

  • Conference Article
  • Citations: 64
  • 10.1109/pcs48520.2019.8954537
Point Cloud Coding: Adopting a Deep Learning-based Approach
  • Nov 1, 2019
  • André F R Guarda + 2 more

Point clouds have recently become an important visual representation format, especially for virtual and augmented reality applications, thus making point cloud coding a very hot research topic. Deep learning-based coding methods have recently emerged in the field of image coding with increasing success. These coding solutions take advantage of the ability of convolutional neural networks to extract adaptive features from the images to create a latent representation that can be efficiently coded. In this context, this paper extends the deep-learning coding approach to point cloud coding using an autoencoder network design. Performance results are very promising, showing improvements over the Point Cloud Library codec often taken as benchmark, thus suggesting a significant margin of evolution for this new point cloud coding paradigm.

  • Research Article
  • 10.1088/2631-8695/ae575c
A deep learning point cloud model for cutting tool prediction using Fourier geometric feature enhancement and adaptive sampling
  • Apr 1, 2026
  • Engineering Research Express
  • Lifeng Yin + 3 more

Future data-driven computer-aided tool selection systems require the capability to autonomously learn from both workpiece information and quality data. To address this need, this study proposes a deep learning-based point cloud analysis method for tool selection in machining. Traditional sampling approaches often fail to accurately extract informative point cloud data, particularly when distinguishing between rough and finish-machined workpieces, such as drilling components, with only subtle geometric differences. The proposed model efficiently acquires valuable point cloud samples during preprocessing and jointly learns both local and global geometric features of workpieces. To overcome the loss of critical geometric information inherent in conventional farthest point sampling, a differentiated sampling strategy is introduced to better capture edge and cutting-surface features. Furthermore, Fourier transform-based frequency domain analysis is employed to enhance the model’s ability to represent multi-scale geometric structures. Finally, a dual attention mechanism is developed to effectively integrate multi-modal features for more robust point cloud classification. Experimental results on the IWD dataset demonstrate that the proposed method achieves an accuracy of 98.89%, outperforming fifteen state-of-the-art baseline models.
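
For context, the conventional farthest point sampling that the abstract uses as its baseline can be sketched as below. This is the standard greedy algorithm, not the differentiated strategy proposed in the paper; the function name and the toy points are illustrative, and the sketch assumes `num_samples` does not exceed the number of points.

```python
def farthest_point_sampling(points, num_samples, start=0):
    """Greedily pick indices so each new point is farthest from those chosen."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    selected = [start]
    # Squared distance from every point to its nearest selected point so far.
    min_dist = [sq_dist(p, points[start]) for p in points]
    while len(selected) < num_samples:
        # The next sample is the point farthest from the current selection
        # (ties resolve to the lowest index).
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, p in enumerate(points):
            d = sq_dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (1.0, 0.0, 0.0),
          (0.0, 1.0, 0.0), (5.0, 5.0, 0.0)]
sampled = farthest_point_sampling(points, 3)
print(sampled)
```

Because the selection depends only on distances, the point at index 1, which lies very close to the starting point, is skipped in favor of spatially spread samples. This uniform-coverage behavior is why fine features such as edges can be under-sampled.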

  • Research Article
  • Citations: 23
  • 10.1016/j.autcon.2024.105473
Automated BIM-to-scan point cloud semantic segmentation using a domain adaptation network with hybrid attention and whitening (DawNet)
  • May 18, 2024
  • Automation in Construction
  • Difeng Hu + 2 more

  • Conference Article
  • Citations: 4
  • 10.1109/agro-geoinformatics50104.2021.9530316
Sementing the Field of Rapeseed from 3D Laser Point Cloud Using Deep Learning
  • Jul 26, 2021
  • Fangzheng Hu + 3 more

Smart agriculture is a significant milestone in the process of agricultural modernization, promoting the integration of agricultural informatization and intelligence. In recent years, new models of intelligent agriculture based on artificial intelligence have developed rapidly. In this paper, 3D laser point clouds are used as research data to carry out in-depth research in the field of agriculture based on deep learning technology. In this study, the deep learning model PointNet++ was used to segment rapeseed point cloud data in the field: (1) The color enhancement algorithm of the HSV color space was used to achieve color threshold segmentation of rapeseed crop point cloud data in a complex field environment, and the Statistical Outlier Filter and Super-Voxel Clustering were each used to segment the group rapeseed point cloud. Finally, two groups of pure rapeseed point cloud data were obtained. (2) Six original rapeseed point cloud data sets were used to train and test the segmentation performance of the PointNet++ (Multi-scale Grouping, MSG) deep learning model for rapeseed point clouds. Intersection over Union (IoU) was taken as the evaluation index of point cloud segmentation accuracy. The IoU values of the rapeseed point cloud data processed by the three segmentation methods were 0.7748, 0.8019 and 0.8260, respectively. The results show that the segmentation performance of the deep learning model based on PointNet++ (MSG) is higher than that of the conventional point cloud segmentation algorithms. The construction of a deep learning framework for crop point cloud segmentation and classification in the field requires the corresponding feature extraction processing based on the geometric structure or attributes of specific crops. In the context of the rapid development of agricultural big data, the deep learning framework is robust in dealing with complex field environments, and the application of deep learning to agricultural research has good prospects.
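
As a reminder of how the evaluation index used above is computed, here is a minimal sketch of per-class Intersection over Union on per-point labels. The label encoding (1 for rapeseed, 0 for background) and the toy label lists are hypothetical and are not data from the study.

```python
def intersection_over_union(predicted, ground_truth, target_label):
    """IoU of one class given per-point predicted and reference labels."""
    intersection = sum(
        1 for p, g in zip(predicted, ground_truth)
        if p == target_label and g == target_label
    )
    union = sum(
        1 for p, g in zip(predicted, ground_truth)
        if p == target_label or g == target_label
    )
    # By convention here, an absent class scores 0.0 instead of dividing by 0.
    return intersection / union if union else 0.0


# Hypothetical per-point labels: 1 = rapeseed, 0 = background.
ground_truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]
iou = intersection_over_union(predicted, ground_truth, target_label=1)
print(iou)
```

In this toy case three points are labeled rapeseed in both lists and five in at least one, giving an IoU of 3/5.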

  • Research Article
  • Citations: 33
  • 10.1016/j.neucom.2020.08.030
Automatic cardiac MRI segmentation and permutation-invariant pathology classification using deep neural networks and point clouds
  • Aug 29, 2020
  • Neurocomputing
  • Yakun Chang + 1 more

  • Research Article
  • Citations: 26
  • 10.1016/j.inffus.2024.102305
Deep learning-based low overlap point cloud registration for complex scenario: The review
  • Feb 16, 2024
  • Information Fusion
  • Yuehua Zhao + 3 more

  • Research Article
  • Citations: 1
  • 10.3390/electronics11193157
Sparse 3D Point Cloud Parallel Multi-Scale Feature Extraction and Dense Reconstruction with Multi-Headed Attentional Upsampling
  • Oct 1, 2022
  • Electronics
  • Meng Wu + 2 more

Three-dimensional (3D) point clouds have a wide range of applications in the field of 3D vision. The quality of the acquired point cloud data considerably impacts the subsequent point cloud processing. Due to the sparsity and irregularity of point cloud data, processing it has always been challenging. Existing deep learning-based point cloud dense reconstruction methods suffer from excessive smoothing of the reconstruction results and too many outliers. The reason is that they cannot extract local and global features at different scales or provide different levels of attention to different regions in order to obtain the long-distance dependence needed for dense reconstruction. In this paper, we use a parallel multi-scale (PMS) feature extraction module based on graph convolution and an upsampling method with an added multi-head attention mechanism to process sparse and irregular point cloud data and obtain expanded point clouds. Specifically, a point cloud training patch with 256 points is used as input. The PMS module uses three residual connections in the multi-scale feature extraction stage. Each PMS module consists of three parallel DenseGCN modules with different convolution kernel sizes and different average pooling sizes. The local and global feature information of the augmented receptive field is extracted efficiently. The scale information is obtained by averaging the different pooled augmented receptive fields. The upsampling stage uses an upsampling rate of r=4. The self-attentive features, which focus differently on different point cloud regions and are obtained by fusing different weights, make the feature representation more diverse. This operation avoids the bias of a single attention head, with each head focusing on extracting valuable fine-grained feature information. Finally, the coordinate reconstruction module outputs a dense point cloud with 1024 points. Experiments show that the proposed method achieves good evaluation metrics and performance and is able to obtain better visual quality. The problems of over-smoothing and excessive outliers are effectively mitigated, and the sparse input point cloud becomes denser.
