Rate-distortion-optimized predictive compression of dynamic 3D mesh sequences
- Conference Article
1
- 10.1109/3dtv.2008.4547793
- May 1, 2008
3D triangle meshes are a common form for representing the geometry of static and dynamic 3D objects. They are employed already in many areas, e.g. e-commerce, video games, online museums, CGI or 3D animated films, etc. Static triangle meshes represent only a piecewise linear approximation of complex 3D objects. As a consequence the approximation error can be unacceptably high unless the number of triangles is sufficiently large. On the other hand a large number of triangles makes these meshes cumbersome to handle and expensive to store or to transmit. Consequently, there exists a demand for techniques for efficient compression of static and dynamic 3D meshes. In this article we start with basics on 3D meshes. Thereafter, we explain the key ideas behind different mesh compression approaches for static and dynamic 3D meshes, and highlight their similarities and differences. Finally, we introduce the upcoming MPEG standard for compression of dynamic 3D meshes, which is referred to as FAMC (Frame-based Animated Mesh Compression), and show comparative compression results.
- Conference Article
15
- 10.1109/icip.2006.312394
- Oct 1, 2006
Recent developments in the compression of dynamic meshes or mesh sequences have shown that the statistical dependencies within a mesh sequence can be exploited well by predictive coding approaches. Coders introduced so far use experimentally determined or heuristic thresholds for tuning the algorithms. In video coding rate-distortion (RD) optimization is often used to avoid fixing of thresholds and to select a coding mode. We applied these ideas and present here an RD-optimized mesh coder. It includes different prediction modes as well as an RD cost computation that controls the mode selection across all possible spatial partitions of a mesh to find the clustering structure together with the associated prediction modes. The structure of the RD-optimized D3DMC coder is presented, followed by comparative results with mesh sequences at different resolutions.
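The RD-based mode selection described in this abstract can be illustrated with a minimal sketch. It assumes the usual Lagrangian cost J = D + λ·R borrowed from video coding. The mode names, distortion values and bit counts below are hypothetical placeholders, not figures from the paper, and the sketch does not cover the search over spatial partitions.

```python
from dataclasses import dataclass


@dataclass
class ModeResult:
    name: str          # hypothetical prediction-mode label
    distortion: float  # e.g. squared error after reconstruction
    rate_bits: float   # bits spent when coding with this mode


def rd_cost(distortion, rate_bits, lam):
    """Lagrangian cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def select_mode(candidates, lam):
    """Return the candidate with the smallest RD cost for the given lambda."""
    return min(candidates, key=lambda m: rd_cost(m.distortion, m.rate_bits, lam))


# Toy candidates (assumed values, for illustration only).
candidates = [
    ModeResult("temporal", distortion=4.0, rate_bits=10.0),
    ModeResult("spatial", distortion=2.0, rate_bits=40.0),
    ModeResult("skip", distortion=9.0, rate_bits=1.0),
]

best = select_mode(candidates, lam=0.1)
```

A small λ favours low distortion and a large λ favours low rate, so the same candidate set yields different winners as λ changes. This is what replaces a hand-tuned threshold.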
- Conference Article
3
- 10.1109/iccc51575.2020.9345051
- Dec 11, 2020
This study proposes an LCU (Large Coding Unit) level fast algorithm for video coding based on AVS3. Currently, AVS3 (Audio Video coding Standard 3) implements a traversal algorithm over CU (Coding Unit) prediction modes. It first computes the RD (Rate Distortion) cost of the current block in the inter-frame modes and then computes the RD cost in the intra-frame modes. By comparing the RD costs, it chooses the best prediction mode. There are three inter-frame modes: skip, direct and inter. There are two intra-frame modes: intra and IBC (Intra Block Copy). This method runs every prediction mode algorithm when it encodes one CU and can therefore choose the optimal prediction mode, though at a large computational cost. Building on previous research, we carry out further surveys, expand the LCU of AVS3 from 64*64 to 128*128 and add more prediction tools. As a result, the prediction modes show stronger spatial correlation, and we present a fast prediction mode algorithm based on the apposition LCU. According to the image content of the apposition LCU in adjacent frames of the same video and its correlation with the optimal prediction mode, we can provide a prediction mode reference for LCUs not yet encoded and reduce the number of traversals. The time complexity is decreased significantly. Our experiments show that under the LDP (Low Delay P-frame) configuration, time complexity over twelve general test cases is reduced by 16.67% with a performance loss of only 0.61%. With little performance loss, we substantially reduce time complexity, and the experimental results are satisfactory overall.
- Book Chapter
- 10.4018/978-1-61692-831-5.ch005
- Jan 1, 2011
Applications of 3D mesh model coding are first presented in this chapter. We then survey the typical existing algorithms in the area of compression of static and dynamic 3D meshes. In an introductory sub-section we introduce basic concepts of 3D mesh models, including data representations, model formats, data acquisition and 3D display technologies. Furthermore, we introduce several typical 3D mesh formats and give an overview of the coding principles of mesh compression algorithms in general, followed by a description of the quantitative measures for 3D mesh compression. Then we describe some typical and state-of-the-art algorithms in 3D mesh compression. Compression and streaming of gigantic 3D models are given special attention. Finally, the MPEG-4 3D mesh model coding standard is briefly described. We conclude this chapter with a discussion providing an overall picture of developments in the mesh coding area and pointing out directions for future research.
- Book Chapter
7
- 10.5772/14608
- Apr 26, 2011
H.264/AVC is the latest international video coding standard developed by the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group, which provides gains in compression efficiency of about 40% compared to previous standards (ISO/IEC 14496-10, 2004, Weigand et al., 2003). New and advanced techniques are introduced in this new standard, such as intra prediction for I-frame encoding, multi-frame inter prediction, small block-size transform coding, context-adaptive binary arithmetic coding (CABAC), deblocking filtering, etc. These advanced techniques offer approximately 40% bit rate saving for comparable perceptual quality relative to the performance of prior standards (Weigand et al., 2003). H.264 intra prediction offers nine prediction modes for 4x4 luma blocks, nine prediction modes for 8x8 luma blocks and four prediction modes for 16x16 luma blocks. However, the rate-distortion (RD) performance of intra frame coding is still lower than that of inter frame coding. Hence intra frame coding usually requires many more bits than inter frame coding, which results in buffer control difficulties and/or dropping of several frames after the intra frames in real-time video. Thus the development of an efficient intra coding technique is an important task for overall bit rate reduction and efficient streaming. H.264/AVC uses the rate-distortion optimization (RDO) technique to get the best coding mode out of nine prediction modes in terms of maximizing coding quality and minimizing bit rates. This means that the encoder has to code the video by exhaustively trying all of the nine mode combinations. The best mode is the one having the minimum rate-distortion (RD) cost. In order to compute the RD cost for each mode, the same operation of forward and inverse transform/quantization and entropy coding is repetitively performed. All of this processing explains the high complexity of RD cost calculation.
Therefore, the computational complexity of the encoder is increased drastically. Using nine prediction modes in intra 4x4 and 8x8 block units for a 16x16 macroblock (MB) can reduce spatial redundancies, but it may need a lot of overhead bits to represent the prediction mode of each 4x4 and 8x8 block. Fast intra mode decision algorithms were proposed to reduce the number of modes that need to be calculated according to some criteria (Sarwer et al., 2008, Tsai et al., 2008, Kim, 2008, Pan et al., 2005, Yang et al., 2004). An intra mode bits skip (IBS) method based on adaptive single-multiple prediction is proposed in order to reduce not only the overhead mode bits but also the computational cost of the encoder (Kim et al., 2010). If the neighbouring pixels of the upper and left blocks are similar, only DC prediction is used and no prediction mode bits are needed; otherwise, nine prediction modes are computed. But the IBS method suffers from some drawbacks: a) the reference pixels in the up-right block are not considered for similarity
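The single-versus-multiple prediction idea behind IBS can be sketched as follows. The excerpt does not state how similarity is measured, so the variance test and the threshold value here are assumptions chosen purely for illustration. The nine mode names are the standard H.264 intra 4x4 modes.

```python
# Standard H.264 intra 4x4 prediction modes.
ALL_4X4_MODES = [
    "Vertical", "Horizontal", "DC",
    "Diagonal-Down-Left", "Diagonal-Down-Right",
    "Vertical-Right", "Horizontal-Down",
    "Vertical-Left", "Horizontal-Up",
]


def neighbours_similar(upper, left, threshold):
    """Assumed similarity test: variance of the reference pixels below a threshold."""
    pixels = list(upper) + list(left)
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return variance < threshold


def choose_intra_candidates(upper, left, threshold=4.0):
    """Return (modes to evaluate, whether mode bits must be signalled)."""
    if neighbours_similar(upper, left, threshold):
        # Flat neighbourhood: DC only, and the mode bits can be skipped.
        return ["DC"], False
    # Otherwise fall back to evaluating all nine modes.
    return list(ALL_4X4_MODES), True


flat_modes, flat_signal = choose_intra_candidates([100, 101, 100, 101], [100, 100, 101, 101])
busy_modes, busy_signal = choose_intra_candidates([10, 200, 30, 180], [90, 20, 250, 60])
```

Because the decoder sees the same reconstructed neighbours, it can repeat the similarity test and infer that DC was used, which is why no mode bits are needed in the flat case.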
- Research Article
3
- 10.3390/jimaging6060055
- Jun 26, 2020
- Journal of Imaging
Recently, spectral methods have been extensively used in the processing of 3D meshes. They usually take advantage of some unique properties that the eigenvalues and the eigenvectors of the decomposed Laplacian matrix have. However, despite their superior behavior and performance, they suffer from computational complexity, especially while the number of vertices of the model increases. In this work, we suggest the use of a fast and efficient spectral processing approach applied to dense static and dynamic 3D meshes, which can be ideally suited for real-time denoising and compression applications. To increase the computational efficiency of the method, we exploit potential spectral coherence between adjacent parts of a mesh and then we apply an orthogonal iteration approach for the tracking of the graph Laplacian eigenspaces. Additionally, we present a dynamic version that automatically identifies the optimal subspace size that satisfies a given reconstruction quality threshold. In this way, we overcome the problem of the perceptual distortions, due to the fixed number of subspace sizes that is used for all the separated parts individually. Extensive simulations carried out using different 3D models in different use cases (i.e., compression and denoising), showed that the proposed approach is very fast, especially in comparison with the SVD based spectral processing approaches, while at the same time the quality of the reconstructed models is of similar or even better reconstruction quality. The experimental analysis also showed that the proposed approach could also be used by other denoising methods as a preprocessing step, in order to optimize the reconstruction quality of their results and decrease their computational complexity since they need fewer iterations to converge.
- Research Article
182
- 10.1109/tcsvt.2014.2313892
- May 27, 2014
- IEEE Transactions on Circuits and Systems for Video Technology
High Efficiency Video Coding (HEVC) adopts the quadtree structured coding unit (CU), which allows recursive splitting into four equally sized blocks. At each depth level, it enables SKIP mode, merge mode, inter 2N × 2N, inter 2N × N, inter N × 2N, inter 2N × nU, inter 2N × nD, inter nL × 2N, inter nR × 2N, inter N × N (only available for the smallest CU), intra 2N × 2N, and intra N × N (only available for the smallest CU) in inter-frames. Similar to H.264/AVC, the mode decision process in HEVC is performed using all the possible depth levels (or CU sizes) and prediction modes to find the one with the least rate distortion (RD) cost using a Lagrange multiplier. This achieves the highest coding efficiency, but leads to a very high computational complexity. Since the optimal prediction mode is highly content dependent, it is not efficient to use all the modes. In this paper, we propose a fast inter-mode decision algorithm for HEVC by jointly using the inter-level correlation of the quadtree structure and the spatiotemporal correlation. There exist strong correlations of the prediction mode, the motion vector and the RD cost between different depth levels and between spatially and temporally adjacent CUs. We statistically analyze the prediction mode distribution at each depth level and the coding information correlation among the adjacent CUs. Based on the analysis results, three adaptive inter-mode decision strategies are proposed, including early SKIP mode decision, prediction size correlation-based mode decision and RD cost correlation-based mode decision. Experimental results show that the proposed overall algorithm can save 49%-52% computational complexity on average with negligible loss of coding efficiency, exhibiting applicability to various types of video sequences.
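The exhaustive quadtree decision that fast algorithms try to shortcut can be sketched as a recursion that compares the RD cost of coding a block whole against the summed cost of its four quadrants. This toy uses assumptions that are not part of HEVC: distortion is the squared error from the block mean, every CU costs a constant `bits_per_cu`, and the individual prediction modes are not modelled.

```python
def sse_from_mean(block):
    """Toy distortion: squared error when the block is predicted by its mean."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels)


def split4(block):
    """Split a square block (list of rows) into four equally sized quadrants."""
    h = len(block) // 2
    return [[row[c:c + h] for row in block[r:r + h]] for r in (0, h) for c in (0, h)]


def rd_cost(block, lam, bits_per_cu=8.0):
    """Lagrangian cost J = D + lambda * R with an assumed constant rate per CU."""
    return sse_from_mean(block) + lam * bits_per_cu


def decide(block, lam, min_size=1):
    """Return (best cost, tree), where tree is 'leaf' or a list of four subtrees."""
    unsplit = rd_cost(block, lam)
    if len(block) <= min_size:
        return unsplit, "leaf"
    results = [decide(sub, lam, min_size) for sub in split4(block)]
    split_cost = sum(cost for cost, _ in results)
    if split_cost < unsplit:
        return split_cost, [tree for _, tree in results]
    return unsplit, "leaf"


flat = [[50] * 4 for _ in range(4)]            # uniform content
edge = [[0, 0, 100, 100] for _ in range(4)]    # vertical edge in the middle
```

With these inputs the uniform block stays unsplit, while the block containing an edge is split once into four flat quadrants. Every node of the tree is evaluated, which is the source of the complexity the abstract refers to.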
- Research Article
289
- 10.1109/tce.2013.6490261
- Feb 1, 2013
- IEEE Transactions on Consumer Electronics
The emerging international standard of High Efficiency Video Coding (HEVC) is a successor to H.264/AVC. In the joint model of HEVC, the tree structured coding unit (CU) is adopted, which allows recursive splitting into four equally sized blocks. At each depth level, it enables up to 34 intra prediction modes. The intra mode decision process in HEVC is performed using all the possible depth levels and prediction modes to find the one with the least rate distortion (RD) cost using Lagrange multiplier. This achieves the highest coding efficiency but requires a very high computational complexity. In this paper, we propose a fast CU size decision and mode decision algorithm for HEVC intra coding. Since the optimal CU depth level is highly content-dependent, it is not efficient to use a fixed CU depth range for a whole image. Therefore, we can skip some specific depth levels rarely used in spatially nearby CUs. Meanwhile, there are RD cost and prediction mode correlations among different depth levels or spatially nearby CUs. By fully exploiting these correlations, we can skip some prediction modes which are rarely used in the parent CUs in the upper depth levels or spatially nearby CUs. Experimental results demonstrate that the proposed algorithm can save 21% computational complexity on average with negligible loss of coding efficiency.
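One way to picture the depth-skipping idea is to restrict the depth levels tested for the current CU to a range derived from already coded neighbours. The rule below (neighbour minimum and maximum, widened by a margin) is an assumption made for illustration. The abstract does not give the exact criterion, and the function name and parameters are hypothetical.

```python
def candidate_depths(neighbour_depths, max_depth=3, margin=1):
    """Return the CU depth levels worth testing, given depths of nearby coded CUs.

    Assumed heuristic: test only depths within `margin` of the range seen in
    the neighbours; with no neighbours available, test every depth.
    """
    if not neighbour_depths:
        return list(range(max_depth + 1))
    lowest = max(0, min(neighbour_depths) - margin)
    highest = min(max_depth, max(neighbour_depths) + margin)
    return list(range(lowest, highest + 1))
```

For example, if the left, upper and upper-left CUs were coded at depths 0, 0 and 1, this sketch would test depths 0 to 2 and skip depth 3.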
- Conference Article
70
- 10.1109/icip.2005.1529827
- Jan 1, 2005
An efficient algorithm for compression of dynamic time-consistent 3D meshes is presented. Such a sequence of meshes contains a large degree of temporal statistical dependencies that can be exploited for compression using DPCM. The vertex positions are predicted at the encoder from a previously decoded mesh. The difference vectors are further clustered in an octree approach. Only a representative for a cluster of difference vectors is further processed providing a significant reduction of data rate. The representatives are scaled and quantized and finally entropy coded using CABAC, the arithmetic coding technique used in H.264/MPEG4-AVC. The mesh is then reconstructed at the encoder for prediction of the next mesh. In our experiments we compare the efficiency of the proposed algorithm in terms of bit-rate and quality compared to static mesh coding and interpolator compression indicating a significant improvement in compression efficiency.
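The closed-loop DPCM structure in this abstract, where the encoder predicts from the previously decoded mesh rather than the original, can be sketched as below. This is a simplified illustration under stated assumptions: a uniform scalar quantizer with step `step`, a zero predictor for the first frame, and no octree clustering or CABAC stage. The function names are hypothetical.

```python
def encode_sequence(frames, step):
    """Encode frames (lists of (x, y, z) tuples) as quantized residual indices."""
    reference = [(0.0, 0.0, 0.0)] * len(frames[0])  # assumed zero predictor for frame 0
    stream = []
    for frame in frames:
        # Quantize the difference to the previously *decoded* frame.
        indices = [tuple(round((c - r) / step) for c, r in zip(vertex, ref))
                   for vertex, ref in zip(frame, reference)]
        stream.append(indices)
        # Reconstruct exactly as the decoder will, so both sides stay in sync.
        reference = [tuple(r + q * step for r, q in zip(ref, idx))
                     for ref, idx in zip(reference, indices)]
    return stream


def decode_sequence(stream, step):
    """Rebuild vertex positions by accumulating dequantized residuals."""
    reference = [(0.0, 0.0, 0.0)] * len(stream[0])
    frames = []
    for indices in stream:
        reference = [tuple(r + q * step for r, q in zip(ref, idx))
                     for ref, idx in zip(reference, indices)]
        frames.append(reference)
    return frames


# Toy two-vertex sequence (assumed values).
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.03, 0.0, 0.0), (1.0, 0.26, 0.0)],
    [(0.07, 0.01, 0.0), (1.0, 0.49, 0.02)],
]
decoded = decode_sequence(encode_sequence(frames, 0.05), 0.05)
```

Because prediction uses the reconstructed mesh, quantization error does not accumulate over time: each coordinate stays within half a quantization step of the original in every frame.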
- Research Article
7
- 10.1007/s11042-017-5394-2
- Nov 14, 2017
- Multimedia Tools and Applications
Dynamic 3D mesh compression is an issue of great practical importance in computer graphics and multimedia applications. In this paper, an efficient compression algorithm is proposed to represent animated mesh sequences in a compact way, so that the storage and transmission of dynamic 3D meshes can be accomplished efficiently. The focus of this paper is on animated mesh sequences with shared connectivity. The proposed method first computes coarse models (low frequency modes) of the animated sequence using the graph Laplacian matrix. The obtained coordinate weights are used at the decoder to reconstruct the coarse models of the sequence. Then, a novel approach is proposed to extract fixed details (high frequency modes or finer features) of the animated mesh. Finally, a details restoration process is applied at the decoder to add details back to the coarse models of the reconstructed sequence. The superiority of the proposed method over the current state of the art is demonstrated in terms of low data rates for a given degree of perceived distortion.
- Research Article
10
- 10.1016/j.image.2013.05.003
- May 22, 2013
- Signal Processing: Image Communication
Efficient early direct mode decision for multi-view video coding
- Research Article
- 10.6843/nthu.2013.00686
- Jan 1, 2013
- Thesis, Department of Computer Science, National Tsing Hua University
High Efficiency Video Coding (HEVC) is the latest video compression standard developed by JCT-VC. It adopts a flexible quad-tree structured coding unit (CU). Each CU can be recursively split into four sub-CUs of equal size. At each CU depth level, various combinations of block partitioning types and prediction modes are evaluated to find the one with the least rate distortion (RD) cost. This introduces very high computational complexity. To reduce the encoding time, we propose three early termination schemes for eliminating unpromising splitting. In the first algorithm, the calculated RD costs of the parent CU and sibling CUs of the current CU are used to avoid unnecessary CU splitting. In the second scheme, an additional threshold value is employed to terminate the splitting process in advance. In the third approach, we utilize the correlation between RD costs and prediction modes and an adaptive threshold value to make an even earlier termination decision. Experimental results show that the proposed algorithm reduces encoding time by 15.7% on average with negligible quality loss and bit-rate overhead.
- Conference Article
- 10.1109/3dtv.2009.5069641
- May 1, 2009
In this paper, we propose a Multiple Description Coding (MDC) method for reliable transmission of compressed time consistent 3D dynamic meshes. It trades off reconstruction quality for error resilience to provide the best expected reconstruction of the 3D mesh sequence at the decoder side. The method is based on partitioning the mesh vertices into two sets and encoding each set independently by a 3D dynamic mesh coder. The encoded independent bitstreams, or so-called descriptions, are transmitted independently. The 3D dynamic mesh coder is based on predictive coding with spatial and temporal layered decomposition. In addition, the proposed method allows for different redundancy allocations by duplicating a number of encoded spatial layers in both sets. The algorithm is evaluated with redundancy-rate-distortion curves, and a flexible trade-off between redundancy and side distortions can be achieved.
- Conference Article
- 10.1117/12.839850
- Jan 17, 2010
- Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE
In this paper, we propose a Multiple Description Coding (MDC) method for reliable transmission of compressed time consistent 3D dynamic meshes. It trades off reconstruction quality for error resilience to provide the best expected reconstruction of 3D mesh sequence at the decoder side. The method is based on partitioning the mesh frames into two sets by temporal subsampling and encoding each set independently by a 3D dynamic mesh coder. The encoded independent bitstreams or so-called descriptions are transmitted independently. The 3D dynamic mesh coder is based on predictive coding with spatial and temporal layered decomposition. In addition, the proposed method allows for different redundancy allocations by including a number of encoded spatial layers of the frames in the other set. The algorithm is evaluated with redundancy-rate-distortion curves and it is shown that, when one of the descriptions is lost, acceptable quality can be achieved with around 50% redundancy.
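The temporal-subsampling partition can be sketched in a few lines: even-indexed frames form one description and odd-indexed frames the other. The concealment step below (averaging the neighbouring received frames when the odd description is lost) is an assumption added for illustration. The abstract does not specify how lost frames are recovered, and the redundancy layers it mentions are not modelled.

```python
def split_descriptions(frames):
    """Partition a frame sequence into two descriptions by temporal subsampling."""
    return frames[0::2], frames[1::2]


def merge_descriptions(even, odd):
    """Interleave both descriptions back into the original frame order."""
    merged = []
    for i in range(len(even) + len(odd)):
        source = even if i % 2 == 0 else odd
        merged.append(source[i // 2])
    return merged


def conceal_lost_odd(even, total_frames):
    """Assumed concealment: rebuild each lost odd frame from its even neighbours."""
    recovered = []
    for i in range(total_frames):
        if i % 2 == 0:
            recovered.append(even[i // 2])
            continue
        prev_frame = even[i // 2]
        next_index = i // 2 + 1
        # At the end of the sequence there is no next frame, so repeat the previous one.
        next_frame = even[next_index] if next_index < len(even) else prev_frame
        recovered.append([(a + b) / 2 for a, b in zip(prev_frame, next_frame)])
    return recovered


# Each frame is a flat list of coordinates; a single coordinate keeps the toy small.
frames = [[0.0], [1.0], [2.0], [3.0], [4.0]]
even, odd = split_descriptions(frames)
```

In this toy the motion is linear, so averaging recovers the lost frames exactly. With real animation the recovered frames would only approximate the originals, which is the side distortion that added redundancy is meant to reduce.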
- Research Article
31
- 10.1109/tcsvt.2014.2310143
- Sep 1, 2014
- IEEE Transactions on Circuits and Systems for Video Technology
Multiview video coding (MVC) adopts a hierarchical B picture prediction structure and offers many prediction modes to effectively remove the spatial, temporal, and inter-view redundancies inherent in multiview video (MVV), but at the price of extremely high computational complexity. To address this problem, a fast MVC method that jointly uses an adaptive prediction structure (APS) and hierarchical mode decision (HMD) is proposed in this paper. The complexity reduction is achieved by: 1) designing four APSs for different MVV contents based on the fact that the contribution of the inter-view prediction varies from sequence to sequence and 2) developing an HMD scheme based on the observation that the relationship between the rate distortion (RD) cost and the size of the prediction mode is a unimodal function. In particular, for the current group of pictures of the input MVV, the prediction structure is adaptively selected based on its characteristic, which is measured by the ratio of the average RD cost of the base view frames to the sum of the average RD cost of the base view frames and that of the anchor frames in nonbase views, and then an HMD scheme is further performed to skip the checking process of those unlikely modes. The experimental results show that, compared with the exhaustive mode decision in MVC, the proposed algorithm reduces the computational complexity by 83.49% on average, while incurring only a 0.086 dB loss in Bjontegaard delta peak signal-to-noise ratio and a 2.97% increase in the total Bjontegaard delta bit rate.
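The unimodality observation suggests a simple early-stopping search: evaluate modes in order of partition size and stop as soon as the RD cost starts to rise. The following is a minimal sketch of that idea only. The mode list, cost values and function name are hypothetical, and the paper's actual HMD scheme may differ.

```python
def hierarchical_mode_search(modes, cost_fn):
    """Scan modes ordered by partition size and stop once the RD cost increases.

    Relies on the assumption that cost is unimodal along this ordering, so the
    first increase means the minimum has already been passed.
    Returns (best mode, best cost, number of modes evaluated).
    """
    best_mode, best_cost = None, float("inf")
    evaluated = 0
    for mode in modes:
        cost = cost_fn(mode)
        evaluated += 1
        if cost < best_cost:
            best_mode, best_cost = mode, cost
        else:
            break  # cost went up: remaining modes are skipped
    return best_mode, best_cost, evaluated


# Toy RD costs per partition size (assumed values).
toy_costs = {"16x16": 12.0, "16x8": 9.0, "8x8": 11.0, "4x4": 15.0}
result = hierarchical_mode_search(list(toy_costs), toy_costs.get)
```

Here the search stops after three of the four modes. If the cost curve is not unimodal for some block, the early stop can miss the true minimum, which corresponds to the small coding loss reported alongside the complexity saving.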