Subblock-Based Combined Inter and Intra Prediction Beyond VVC
Combined Inter and Intra Prediction (CIIP) is a coding tool that blends an inter prediction with an intra prediction to generate a hybrid prediction. It is efficient when neither intra nor inter prediction alone captures the content characteristics of a coding block well. Owing to its considerable coding performance, CIIP was adopted in Versatile Video Coding (VVC) and has been extended further in the recent Enhanced Compression Model (ECM). However, CIIP remains less effective for content with complex motion, because it uses only a translational motion model. To address this issue, this paper investigates subblock-based CIIP (subblock-CIIP), in which the motion compensation signal of the affine or subblock-based TMVP (SbTMVP) mode can be combined with an intra prediction to produce the CIIP prediction. Simulation results on ECM-11.0 indicate that subblock-CIIP achieves an average coding gain of 0.24% on sequences with abundant affine motion. The proposed subblock-CIIP method has been adopted in ECM-12.0.
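As a rough illustration of the blending that CIIP performs, the sketch below combines an inter and an intra prediction block with integer weights. The weight rule (an intra weight of 1, 2, or 3 out of 4, depending on how many of the top and left neighbours are intra coded) follows the VVC design as it is commonly described; the function names and sample values are hypothetical. In subblock-CIIP the inter input would come from affine or SbTMVP motion compensation, and the weighting actually used in ECM may differ from this sketch.

```python
def ciip_intra_weight(top_is_intra: bool, left_is_intra: bool) -> int:
    """Intra weight out of 4, assuming the VVC-style neighbour rule."""
    return 1 + int(top_is_intra) + int(left_is_intra)


def ciip_blend(inter_pred, intra_pred, w_intra):
    """Blend two equally sized 2-D blocks: (w*intra + (4-w)*inter + 2) >> 2."""
    w_inter = 4 - w_intra
    return [
        [(w_intra * pa + w_inter * pe + 2) >> 2 for pe, pa in zip(row_e, row_a)]
        for row_e, row_a in zip(inter_pred, intra_pred)
    ]


# Hypothetical 2x2 blocks; the inter block stands in for an affine/SbTMVP prediction.
inter_block = [[100, 104], [108, 112]]
intra_block = [[120, 120], [120, 120]]
w = ciip_intra_weight(top_is_intra=True, left_is_intra=False)
blended = ciip_blend(inter_block, intra_block, w)
```

With one intra-coded neighbour the two predictions are weighted equally, so each blended sample lies halfway between its inter and intra inputs (rounded down after adding the rounding offset).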
- Book Chapter
- 10.5772/19308
- Jun 24, 2011
H.264/AVC is currently one of the most commonly used video coding standards. Its compression efficiency is higher than that of any previous video coding standard, as it includes more sophisticated coding techniques, such as intra prediction, variable block size motion estimation, rate-distortion optimized mode decision, and entropy coding (Luthra et al., 2003; Sullivan & Wiegand, 2005; Wiegand et al., 2003). Intra prediction is an important technique in image and video coding for reducing the spatial redundancy between adjacent blocks. Unlike previous coding standards such as H.263+ and MPEG-4 Part-2, in which intra prediction was performed in the transform domain, the intra prediction of H.264/AVC is defined entirely in the pixel domain by referring to neighboring samples of coded blocks (Sullivan et al., 2004; Sullivan et al., 2004). Recently, many intra prediction approaches have been proposed. To capture the local information of neighboring reconstructed samples more accurately, 34 prediction modes are employed in angular intra prediction for the Intra_8x8 mode (Ugur et al., 2010), and arbitrary directional intra (ADI) is used for the Intra_16x16 mode (McCann et al., 2010). To combine two types of H.264/AVC intra prediction modes, bidirectional intra prediction (BIP) was proposed (Matsuo et al., 2007). In some cases, image blocks contain repeated patterns rather than distinct directional information. In such cases, using global information in place of the spatially neighboring samples yields better coding efficiency. Related works include intra displacement compensation (IDC) (Yu & Chrysafis, 2002) and template matching (TM). IDC uses one intra displacement vector per block partition to obtain the reference samples. TM instead matches templates that have already been reconstructed.
Further enhancements using the TM scheme are matching using a single template (Tan et al., 2006), backward-adaptive texture synthesis (Wei & Levoy, 2000), multiple candidates (Tan et al., 2007), priority-guided template matching (Guo et al., 2008), and locally adaptive illumination compensation (Zheng et al., 2008). Although these approaches improve the coding performance, they still suffer from the limitation of the block-based structure. In the block-based structure, it is difficult to predict the samples far from the reconstructed block boundaries. Thus, a new intra prediction method, line-based intra prediction (Sohn & Han, 2007; Peng et al., 2010) is suggested. Until now, line-based coding seems to overcome the shortcomings of block-based prediction, because each line within the current block shares an equal processing and is predicted and
- Book Chapter
- 10.1007/11573036_60
- Jan 1, 2005
The H.264 video coding standard can achieve considerably higher coding efficiency than previous standards. The key to this high coding efficiency is mainly the Intra and Inter prediction modes provided by the standard. However, the compression efficiency of the H.264 standard comes at the cost of increased encoder complexity. It is therefore very important to design video architectures that minimize the cost of the prediction modes in terms of area, power dissipation and design complexity. A common aspect of the Inter and Intra Prediction modes is the Sum of Absolute Differences (SAD). In this paper we present a new algorithm that can replace the SAD in Intra Prediction and that provides a more efficient hardware implementation.
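For reference, the conventional Sum of Absolute Differences mentioned in this abstract can be sketched as follows. This shows only the standard SAD cost; the replacement algorithm proposed in the paper is not described in the abstract and is not reproduced here. The function name and the sample blocks are illustrative.

```python
def sad(block_a, block_b):
    """Sum of absolute sample differences between two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


# Hypothetical 2x2 original block and one candidate prediction.
original = [[10, 12], [14, 16]]
predicted = [[11, 12], [10, 20]]
cost = sad(original, predicted)
```

An encoder would typically evaluate this cost for every candidate prediction mode and keep the mode with the smallest value.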
- Research Article
1
- 10.5909/jbe.2008.13.2.200
- Mar 31, 2008
- Journal of Broadcast Engineering
H.264|MPEG-4 AVC is the most recent video compression standard defined by the JVT (Joint Video Team), formed jointly by ITU-T and ISO/IEC. Many techniques have been adopted to improve its compression efficiency. Intra prediction in inter frames (P-frames) is one of them, but it sharply increases encoding time because candidate modes must be decided and rate-distortion (RD) costs calculated for every macroblock, so fast mode decision has become an important issue. In this paper, based on the coding results of 16×16 and 4×4 intra prediction, the statistical correlation between the two is established and used to reduce the number of candidate modes for 4×4 intra prediction. First, the motion vector information obtained in the inter prediction stage is used to decide in advance whether the current macroblock needs intra prediction. Second, during the coding of each intra frame (I-frame), the distribution of the final 4×4 intra prediction modes for each final 16×16 intra prediction mode is sorted by cumulative probability, and a reference table is built during encoding that contains only the candidate modes needed to reach a predefined cumulative probability. This table is then used for the intra prediction of all inter frames in the same GOP. Experiments with JM11.0, the reference software of H.264|MPEG-4 AVC, show that the total encoding time can be reduced by up to 51.24% with negligible PSNR drop and bitrate increase.
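The candidate-reduction step can be sketched as follows: for one 16×16 intra mode, the observed 4×4 modes are ranked by frequency and kept until their cumulative probability reaches a threshold. The mode counts, the 0.9 threshold, and the function name are hypothetical; the abstract does not state the values the authors used.

```python
def candidate_modes(mode_counts, threshold):
    """Return the most frequent modes whose cumulative probability first reaches threshold."""
    total = sum(mode_counts.values())
    kept, cumulative = [], 0.0
    # Most frequent first; ties are broken by mode index so the result is deterministic.
    for mode, count in sorted(mode_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        kept.append(mode)
        cumulative += count / total
        if cumulative >= threshold:
            break
    return kept


# Hypothetical counts of the best 4x4 mode observed in I-frames for one 16x16 mode.
counts_for_one_16x16_mode = {0: 50, 1: 10, 2: 25, 3: 5, 4: 10}
reduced = candidate_modes(counts_for_one_16x16_mode, threshold=0.9)
```

In this toy example four of the five modes are retained; a table of such lists, one per 16×16 mode, would then restrict the 4×4 search in the following P-frames of the GOP.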
- Research Article
39
- 10.1109/tip.2017.2740161
- Aug 14, 2017
- IEEE Transactions on Image Processing
As an extension of High Efficiency Video Coding (HEVC), the Scalable High Efficiency Video Coding (SHVC) introduces multiple layers with inter-layer predictions, which greatly increases the complexity on top of the already complicated HEVC encoder. In Intra prediction for Quality SHVC, Coding Tree Unit (CTU) allows recursive splitting into four depth levels, which considers 35 Intra prediction modes and interlayer reference (ILR) mode to determine the best possible mode at each depth level. This achieves the highest coding efficiency but incurs a substantially high computational complexity. In this paper, we propose a novel Intra prediction scheme to effectively speed up the enhancement layer Intra-coding in Quality SHVC. The new features of the proposed framework include: First, spatial correlation and its correlation degree are combined to predict most probable depth level candidates. Second, for a given depth candidate, based on the probabilities of the ILR mode, we check the ILR mode by examining the residual distribution based on skewness and kurtosis to determine whether the residuals follow a Gaussian distribution. In that case, the Intra prediction comparisons, which require a high complexity, are skipped. Third, during Intra prediction selection from 35 Intra prediction modes, spatial and inter-layer correlations are combined with the local monotonicity of the Hadamard costs associated with the modes in a small neighborhood, to examine only a portion of Intra prediction modes. Finally, a hypothesis testing on the currently selected depth level is performed to examine whether the residuals present significant differences within their block to early terminate depth selection. The proposed multi-step multistrategy scheme aims to minimize the number of depth selections while greatly reducing the mode decision complexity for a depth candidate in a hierarchical fashion. 
Our experimental results demonstrate that the proposed scheme can achieve a speedup gain of more than 75% on average on the test video sequences, while maintaining almost the same coding efficiency.
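The residual check in the second step relies on skewness and kurtosis. A minimal sketch of these two statistics and of a crude Gaussianity test is given below; the tolerance values and function names are assumptions made for illustration, and the paper's actual decision rule is not given in the abstract.

```python
def skewness_and_excess_kurtosis(samples):
    """Moment-based sample skewness and excess kurtosis (both about 0 for Gaussian data)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    if m2 == 0:
        raise ValueError("statistics are undefined for constant samples")
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def looks_gaussian(samples, skew_tol=0.5, kurt_tol=1.0):
    """Hypothetical test: both statistics must be close to their Gaussian values."""
    skew, kurt = skewness_and_excess_kurtosis(samples)
    return abs(skew) <= skew_tol and abs(kurt) <= kurt_tol


# Hypothetical residuals: one outlier makes the distribution strongly skewed.
residuals = [0, 0, 0, 0, 10]
skew, kurt = skewness_and_excess_kurtosis(residuals)
```

Under the scheme described above, residuals that pass such a test would allow the costly Intra mode comparisons to be skipped.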
- Conference Article
25
- 10.1109/mmsp.2011.6093805
- Oct 1, 2011
In the current working draft of HEVC, residual quad-tree (RQT) coding is used to encode prediction residuals in both Intra and Inter coding units (CU). However, the rationale for using RQT as a coding tool is different in the two cases. For Intra prediction units, RQT provides an efficient syntax for coding a number of sub-blocks with the same intra prediction mode. For Inter CUs, RQT adapts to the spatial-frequency variations of the CU, using as large a transform size as possible while catering to local variations in residual statistics. While providing coding gains, effective use of RQT currently requires an exhaustive search of all possible combinations of transform sizes within a block. In this paper, we exploit our insights to develop two fast RQT algorithms, each designed to meet the needs of Intra and Inter prediction residual coding.
- Conference Article
93
- 10.1109/icip.2019.8803777
- Sep 1, 2019
The upcoming Versatile Video Coding (VVC) Standard includes various new intra prediction tools that were not present in its predecessor High Efficiency Video Coding (HEVC), such as the wide angle intra prediction, the position dependent prediction combination or the multiple reference line intra prediction. In order to improve the intra prediction coding efficiency, this paper proposes the usage of the Intra Subpartition (ISP) algorithm. ISP is an updated version of the Line-Based Intra Prediction (LIP) mode that improves the trade-off between coding gain and complexity of the original method. The basic principle of ISP consists in subdividing an intra-predicted block into 2 or 4 subpartitions of at least 16 samples according to the original block dimensions. The method has been implemented on top of the VVC Test Model 3.0 (VTM-3.0), with the result of obtaining a gain of 0.57% with encoding and decoding run-times of 112% and 102% respectively for the All Intra configuration and a gain of 0.29% with encoding and decoding run-times of 102% and 101% respectively for the Random Access configuration.
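The subdivision rule can be sketched as below. The thresholds are an assumption based on the VVC-style rule (no split for 4×4, two subpartitions for 4×8 and 8×4, four otherwise), which is consistent with the "at least 16 samples" constraint stated in the abstract; the function names are hypothetical and block dimensions are assumed to be powers of two no smaller than 4.

```python
def isp_num_subpartitions(width: int, height: int) -> int:
    """Number of ISP subpartitions, assuming the VVC-style size rule."""
    samples = width * height
    if samples < 32:
        return 1  # e.g. 4x4: any split would leave fewer than 16 samples per part
    if samples == 32:
        return 2  # 4x8 and 8x4
    return 4


def isp_subpartition_size(width: int, height: int, horizontal_split: bool):
    """(width, height) of each subpartition for a horizontal or vertical split."""
    n = isp_num_subpartitions(width, height)
    if horizontal_split:
        return width, height // n
    return width // n, height
```

For example, an 8×8 block split horizontally yields four 8×2 subpartitions of 16 samples each, and each subpartition can be predicted from the reconstruction of the previous one.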
- Conference Article
19
- 10.1109/pcs.2018.8456305
- Jun 1, 2018
In existing video coding standards such as H.264/AVC and HEVC, the intra prediction is typically derived using fixed, symmetric prediction filters along the prediction direction, e.g., in planar mode, top-right and bottom-left samples are predicted using symmetric prediction filters. However, in case of asymmetric availability of neighboring reference samples, the performance of the intra prediction filters designed in HEVC may not be optimal. To further refine the intra prediction and achieve higher accuracy of prediction samples, this paper proposes low-complexity refinements over HEVC intra prediction, which are applied to the frequently used planar, DC, horizontal and vertical modes. The proposed method only requires simple addition and bit-shift operations on top of HEVC’s intra prediction implementation. Experimental results show that an average coding gain of 0.7% is achieved for intra coding with no increase in run-time complexity.
- Conference Article
13
- 10.1109/vcip.2018.8698658
- Dec 1, 2018
Intra prediction in modern video codecs is able to efficiently reduce spatial redundancy in video frames. With preceding pixels as context, traditional intra prediction schemes generate linear predictions based on several predefined directions (i.e. modes) for the current prediction unit (PU). However, these modes are relatively simple and are not able to handle complex textures, which leads to additional bits encoding the residue. In this paper, we design a convolutional neural network (CNN) guided spatial recurrent neural network (RNN) to improve the intra prediction in High-Efficiency Video Coding (HEVC). By exploring the correlations between pixels, the network learns to generate prediction signal in a progressive manner. The progressive model solves the problem of asymmetry in intra prediction naturally. As the model is designed for global context modeling, no flags for intra prediction modes selection need to be encoded. Our proposed intra prediction scheme achieves on average 1.2% bit-rate saving compared with HEVC.
- Conference Article
3
- 10.1109/pcs.2016.7906394
- Jan 1, 2016
Intra prediction is an important component of video compression. The High Efficiency Video Coding (HEVC) standard supports advanced Intra prediction tools to achieve remarkable Intra coding performance. However, higher compression efficiency performance may be achieved using Combined Intra Prediction (CIP). CIP consists in combining conventional Intra reference samples, with samples extracted from within the current block being encoded. In this paper, an improvement to the original CIP approach is proposed to achieve higher compression efficiency. Due to the fact the encoder does not have access to reconstruction samples while compressing a block, a controlled drift occurs at the block level between the Intra prediction blocks generated at the encoder and decoder when using CIP. The proposed approach aims at reducing such a drift, hence improving CIP performance. This Improved CIP is able to achieve up to 1.4 % BD-rate savings with respect to the HEVC reference software. The paper also proposes the combination of Improved CIP with an additional tool for enhancing the accuracy of Intra prediction. Up to 1.9 % BD-rate savings can be achieved using this approach.
- Conference Article
2
- 10.1109/vcip53242.2021.9675443
- Dec 5, 2021
Intra prediction is typically used to exploit the spatial redundancy in video coding. In the latest video coding standard, Versatile Video Coding (VVC), 67 intra prediction modes are adopted. The encoder selects the best of the 67 modes and signals it to the decoder. The bits consumed in signaling the selected mode may limit the coding efficiency. To reduce the overhead of signaling the intra prediction mode, a probability-based decoder-side intra mode derivation (P-DIMD) is proposed in this paper. Specifically, an intra prediction mode candidate set is constructed based on the probabilities of intra prediction modes. The probability of an intra prediction mode is mainly estimated in two ways. First, textures are typically continuous within a local region, so the intra prediction modes of neighboring blocks are similar to each other. Second, some intra prediction modes are used more often than others. For each intra prediction mode in the constructed candidate set, intra prediction is performed on a template to calculate a cost. The intra prediction mode with the minimum cost is determined as the optimal mode and used in the intra prediction of the current block. Experimental results demonstrate that P-DIMD can achieve 0.56% BD-rate saving on average compared to VTM-11.0 under the all intra configuration.
- Conference Article
10
- 10.1109/dicta.2011.114
- Dec 1, 2011
This paper presents an efficient method for deciding the macro-block mode and selecting the prediction mode for intra prediction in H.264/AVC high profile. H.264/AVC supports nine intra prediction modes for luminance 4x4 and 8x8 blocks. The predictors include 8 directional modes and the intra DC mode, which is non-directional. For luminance 16x16 blocks, the following 4 intra prediction modes are used: Vertical, Horizontal, DC and Plane mode. To select the best prediction mode, the rate-distortion costs of all possible prediction modes are calculated exhaustively. Thus, the intra prediction process in H.264/AVC has high computational complexity. In high profile in particular, the complexity is extremely large, since luminance 8x8 blocks are added as a candidate macro-block mode. To overcome this issue, we propose an efficient macro-block mode decision and prediction mode selection for intra prediction in H.264/AVC high profile. We use the variance of the current macro-block to decide the macro-block mode and propose an improved prediction mode selection to select the intra prediction mode. The experimental results show that the proposed algorithm reduces encoding time by 83.194% on average (up to 87.081%) without noticeable rate-distortion performance loss.
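A variance-based macro-block mode decision of the kind described here might look like the following sketch. The two thresholds, the mapping from variance ranges to block sizes (smooth content to larger blocks), and the function names are all assumptions; the abstract does not specify the actual rule.

```python
def block_variance(block):
    """Population variance of all samples in a 2-D block."""
    samples = [s for row in block for s in row]
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def choose_macroblock_mode(block, low_threshold, high_threshold):
    """Hypothetical rule: smooth blocks use large partitions, textured ones small."""
    variance = block_variance(block)
    if variance < low_threshold:
        return "Intra16x16"
    if variance < high_threshold:
        return "Intra8x8"
    return "Intra4x4"


# Hypothetical 16x16 luminance macro-blocks: one flat, one checkerboard.
flat = [[128] * 16 for _ in range(16)]
checker = [[255 if (x + y) % 2 == 0 else 0 for x in range(16)] for y in range(16)]
flat_mode = choose_macroblock_mode(flat, low_threshold=100, high_threshold=1000)
checker_mode = choose_macroblock_mode(checker, low_threshold=100, high_threshold=1000)
```

Deciding the macro-block mode this way avoids computing rate-distortion costs for the block sizes that are ruled out, which is where the encoding-time saving would come from.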
- Conference Article
5
- 10.1117/12.2274035
- Sep 19, 2017
The demand for streaming video content is growing exponentially. Network bandwidth is very costly, and therefore there is a constant effort to improve video compression rates and enable the sending of reduced data volumes while retaining quality of experience (QoE). One basic feature that utilizes the spatial correlation of pixels for video compression is Intra-Prediction, which determines the codec's compression efficiency. Intra prediction enables a significant reduction of the Intra-Frame (I frame) size and therefore contributes to efficient exploitation of bandwidth. In this presentation, we propose new Intra-Prediction algorithms that improve the AV1 prediction model and provide better compression ratios. Two types of methods are considered: (1) a new scanning order method that maximizes spatial correlation in order to reduce prediction error; and (2) an implementation of new Intra-Prediction modes in AV1. Modern video coding standards, including the AV1 codec, utilize fixed scan orders in processing blocks during intra coding. The fixed scan orders typically result in residual blocks with high prediction error, mainly in blocks with edges. This means that the fixed scan orders cannot fully exploit the content-adaptive spatial correlations between adjacent blocks, so the bitrate after compression tends to be large. To reduce the bitrate induced by inaccurate intra prediction, the proposed approach adaptively chooses the scanning order of blocks, first predicting the blocks with the maximum number of surrounding, already Inter-Predicted blocks.
Using the modified scanning order method and the new modes has reduced the MSE by up to five (5) times when compared to conventional TM mode / Raster scan and up to two (2) times when compared to conventional CALIC mode / Raster scan, depending on the image characteristics (which determines the percentage of blocks predicted with Inter-Prediction, which in turn impacts the efficiency of the new scanning method). For the same cases, the PSNR was shown to improve by up to 7.4dB and up to 4 dB, respectively. The new modes have yielded 5% improvement in BD-Rate over traditionally used modes, when run on K-Frame, which is expected to yield ~1% of overall improvement.
- Research Article
4
- 10.1109/tip.2023.3286256
- Jan 1, 2023
- IEEE Transactions on Image Processing
Intra prediction is a crucial part of video compression, which utilizes local information in images to eliminate spatial redundancy. As the state-of-the-art video coding standard, Versatile Video Coding (H.266/VVC) employs multiple directional prediction modes in intra prediction to find the texture trend of local areas. Then the prediction is made based on reference samples in the selected direction. Recently, neural network-based intra prediction has achieved great success. Deep network models are trained and applied to assist the HEVC and VVC intra modes. In this paper, we propose a novel tree-structured data clustering-driven neural network (dubbed TreeNet) for intra prediction, which builds the networks and clusters the training data in a tree-structured manner. Specifically, in each network split and training process of TreeNet, every parent network on a leaf node is split into two child networks by adding or subtracting Gaussian random noise. Then data clustering-driven training is applied to train the two derived child networks using the clustered training data of their parent. On the one hand, the networks at the same level in TreeNet are trained with non-overlapping clustered datasets, and thus they can learn different prediction abilities. On the other hand, the networks at different levels are trained with hierarchically clustered datasets, and thus they will have different generalization abilities. TreeNet is integrated into VVC to assist or replace intra prediction modes to test its performance. In addition, a fast termination strategy is proposed to accelerate the search of TreeNet. The experimental results demonstrate that when TreeNet is used to assist the VVC Intra modes, TreeNet with depth = 3 can bring an average of 3.78% bitrate saving (up to 8.12%) over VTM-17.0. If TreeNet with the same depth replaces all VVC intra modes, an average of 1.59% bitrate saving can be reached.
- Conference Article
77
- 10.1109/dcc.2017.53
- Apr 1, 2017
Traditional intra prediction methods for HEVC rely on using the nearest reference lines for predicting a block, which ignore much richer context between the current block and its neighboring blocks and therefore cause inaccurate prediction especially when weak spatial correlation exists between the current block and the reference lines. To overcome this problem, in this paper, an intra-prediction convolutional neural network (IPCNN) is proposed for intra prediction, which exploits the rich context of the current block and therefore is capable of improving the accuracy of predicting the current block. Meanwhile, the reconstruction of the three nearest blocks can also be refined. To the best of our knowledge, this is the first paper that directly applies CNNs to intra prediction for HEVC. Experimental results validate the effectiveness of applying CNNs to intra prediction and the proposed method can achieve 0.70% bitrate reduction compared to HEVC reference software HM-14.0.
- Research Article
1
- 10.1007/s11042-020-09544-8
- Sep 21, 2020
- Multimedia Tools and Applications
Omnidirectional videos, or 360-degree videos, play an important role in virtual reality (VR) applications. In order to employ the existing video coding standards, omnidirectional videos are first projected onto a two-dimensional (2D) plane, which generates discontinuities at the boundaries and may result in unexpected artifacts when lossy coding is applied. In this paper, we propose a new intra prediction method to deal with these coding artifacts in omnidirectional videos. Unlike conventional intra prediction, which uses the left and top reference samples, the right reference samples are derived from the reconstructed samples on the left boundary and padded for intra prediction. The proposed method applies to the planar, DC and some of the angular prediction modes in intra prediction. The experimental results demonstrate a Bjontegaard-Delta-rate reduction of up to 2.97% using the weighted spherical peak signal-to-noise ratio (WSPSNR) quality metric for the coding tree units (CTUs) at the right boundary.