This paper presents a coding unit (CU) depth-decision algorithm that uses neural networks to reduce the computational overhead of High Efficiency Video Coding (HEVC). The coding tree unit (CTU) of HEVC has a quad-tree structure, and its computational complexity is high because the encoder searches for the optimal CU depth recursively and exhaustively, from the upper depth to the lower depth. In the proposed method, neural networks are used to predict the CTU depth. A database for the neural networks is constructed that considers both the image and encoding properties of the CU. It consists of image data representing the pixel values of the CU, vector data based on the encoding information of the CU, and labels indicating whether the CU is split. Using both properties of the CU yields high test accuracy. The database is completely separate from the test sequences used for encoding and can be built from sequences with various resolutions, motions, and contents so that it covers diverse CUs. We also design a neural-network architecture and train it. The architecture uses convolution and pooling layers to analyze the image property of the CU. The resulting feature map is concatenated with the vector data and passed to fully connected layers, which analyze the encoding property of the CU. Finally, a fast CU depth-decision algorithm is designed based on the trained neural networks. When the neural network inference at the current CU depth returns non-split, the operations on the lower CU depths are skipped. The experimental results show that the proposed method can reduce the computational overhead by 61.77% on average, and by a maximum of 73.45%, with a 3.91% Bjøntegaard-difference bitrate (BD-rate) degradation.
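The early-termination rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the trained network is replaced by a hypothetical variance-based stand-in (`variance_predictor`), the function names and the threshold are assumptions made for illustration, and the depth range 0 to 3 (64x64 down to 8x8 CUs) is the common HEVC configuration rather than a setting taken from the paper. The sketch only counts how many CUs would still need a rate-distortion evaluation when a non-split prediction prunes the lower depths.

```python
from typing import Callable, List

Block = List[List[int]]  # square block of luma samples as nested lists


def quadrants(block: Block) -> List[Block]:
    """Split a square block into its four equally sized sub-blocks."""
    h = len(block) // 2
    return [[row[c:c + h] for row in block[r:r + h]]
            for r in (0, h) for c in (0, h)]


def variance_predictor(block: Block, depth: int, threshold: float = 1.0) -> bool:
    """Hypothetical stand-in for the trained network (an assumption):
    predict 'split' when the sample variance exceeds a threshold."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return var > threshold


def always_split(block: Block, depth: int) -> bool:
    """Models the exhaustive search: every CU is split down to max depth."""
    return True


def count_evaluated_cus(block: Block,
                        predict_split: Callable[[Block, int], bool],
                        depth: int = 0,
                        max_depth: int = 3) -> int:
    """Count the CUs that are evaluated when lower depths are skipped
    as soon as the predictor returns non-split for the current CU."""
    evaluated = 1  # the current CU is always evaluated
    if depth == max_depth or not predict_split(block, depth):
        return evaluated  # non-split: skip all lower depths
    for sub in quadrants(block):
        evaluated += count_evaluated_cus(sub, predict_split, depth + 1, max_depth)
    return evaluated


if __name__ == "__main__":
    flat_ctu = [[0] * 64 for _ in range(64)]  # a perfectly flat 64x64 CTU
    print("exhaustive:", count_evaluated_cus(flat_ctu, always_split))
    print("with early termination:", count_evaluated_cus(flat_ctu, variance_predictor))
```

For a flat 64x64 CTU, the exhaustive search visits 1 + 4 + 16 + 64 = 85 CUs, whereas the pruned search stops at the root. In a real encoder the saving depends on how often the network predicts non-split and on the cost of the inference itself.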