The Joint Video Experts Team (JVET) has created the Versatile Video Coding standard (VVC/H.266), the most recent video coding standard, which offers a broad selection of coding tools. Mature commercial VVC codecs can significantly reduce costs and improve coding efficiency. However, VVC introduces binary and ternary tree partitioning, which allows coding units (CUs) to take many different shapes and increases encoding complexity. This article proposes a technique to simplify VVC intra prediction using gradient analysis and a multi-feature fusion convolutional neural network (CNN). The gradient of each CU is computed with the Sobel operator, and the result is used for an early decision on whether to split the CU. For CUs where the gradient alone cannot settle whether to split, the CNN makes the further decision. The standard deviation (SD) and the initial depth of the CU serve as the input features of the CNN. The initial depth is obtained from a partition depth prediction dictionary, which can be consulted for a CU of any shape. The algorithm can therefore decide whether to split CUs of varying sizes, reducing the complexity of the CU partitioning process and making VVC more practical. Experimental results demonstrate that the proposed algorithm reduces encoding time by 36.56% with a minimal increase of 1.06% in Bjøntegaard delta bit rate (BD-BR) compared to the original algorithm.
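As a rough illustration of the gradient-based early decision and the SD feature described above, the following Python sketch computes a mean Sobel gradient and a standard deviation for one luma block. The function names, the |Gx| + |Gy| gradient measure, and the thresholds `t_low` and `t_high` are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from statistics import pstdev

# Standard 3x3 Sobel kernels (horizontal and vertical derivatives).
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def mean_sobel_gradient(block):
    """Average |Gx| + |Gy| over the interior pixels of a 2-D luma block."""
    h, w = len(block), len(block[0])
    total = 0
    count = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    p = block[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            total += abs(gx) + abs(gy)
            count += 1
    return total / count if count else 0.0


def block_sd(block):
    """Population standard deviation of all pixels (one assumed CNN input)."""
    return pstdev([p for row in block for p in row])


def pre_decide(block, t_low=2.0, t_high=40.0):
    """Hypothetical early decision; t_low and t_high are illustrative only."""
    g = mean_sobel_gradient(block)
    if g <= t_low:
        return "no_split"   # smooth block: assume no further partitioning
    if g >= t_high:
        return "split"      # strong texture or edge: assume partitioning
    return "undecided"      # left to the CNN in the proposed scheme


flat = [[128] * 8 for _ in range(8)]
edge = [[0] * 4 + [255] * 4 for _ in range(8)]
ramp = [list(range(8)) for _ in range(8)]
```

In this sketch, a block that falls between the two thresholds is returned as `"undecided"`; in the scheme the abstract describes, such a CU would be passed to the CNN together with its SD and its initial depth from the dictionary.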