A Complexity Reduction Method for VVC Intra Prediction Based on Statistical Analysis and SAE-CNN

Jinchao Zhao,Qiuwen Zhang,Pu Dai

doi:10.3390/electronics10243112

Open Access

https://doi.org/10.3390/electronics10243112

Journal: Electronics	Publication Date: Dec 14, 2021
Citations: 6	License type: CC BY 4.0

Affiliation: Zhengzhou University of Light Industry

Abstract

Compared with High Efficiency Video Coding (HEVC), the latest video coding standard, Versatile Video Coding (VVC), greatly improves coding quality by introducing many novel technologies, including the Quad-tree with nested Multi-type Tree (QTMT) scheme for block division. With QTMT, however, the encoder must perform Rate–distortion Optimization (RDO) for every candidate division mode during Coding Unit (CU) division in order to select the best one, which increases coding time and coding complexity. We therefore propose a VVC intra prediction complexity reduction algorithm based on statistical theory and a Size-adaptive Convolutional Neural Network (SAE-CNN). The algorithm combines a pre-decision dictionary built from statistical theory with a Convolutional Neural Network (CNN) model whose pooling layer size is adjusted adaptively, forming an adaptive CU size division decision process. It decides whether CUs of different sizes should be divided, thereby avoiding unnecessary RDO and reducing coding time. Experimental results show that, compared with the original algorithm, the proposed algorithm saves 35.60% of the coding time while increasing the Bjøntegaard Delta Bit Rate (BD-BR) by only 0.91%.
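
The two-stage decision described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the dictionary key (`stat_key`), the function names, and the 0.5 threshold are all hypothetical, and the CNN is replaced by a stub.

```python
from typing import Callable, Dict, Hashable, List, Optional

Block = List[List[int]]  # luma samples of one CU (hypothetical representation)


def decide_split(
    block: Block,
    stat_key: Hashable,
    pre_decision: Dict[Hashable, Optional[bool]],
    cnn_split_prob: Callable[[Block], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> bool:
    """Return True if the CU should be divided further, False otherwise."""
    # Stage 1: look up the statistics-based pre-decision dictionary.
    decided = pre_decision.get(stat_key)
    if decided is not None:
        return decided  # early decision, no CNN inference and no extra RDO
    # Stage 2: fall back to the CNN prediction for undecided cases.
    return cnn_split_prob(block) >= threshold


# Usage with made-up entries and a stub in place of the trained network.
pre_decision = {("64x64", "flat"): False}
stub_cnn = lambda block: 0.9
cu = [[128] * 8 for _ in range(8)]

print(decide_split(cu, ("64x64", "flat"), pre_decision, stub_cnn))      # False
print(decide_split(cu, ("32x32", "textured"), pre_decision, stub_cnn))  # True
```

Putting the dictionary lookup first keeps the cheap decision in front of the more expensive network inference, which matches the stated goal of reducing coding time.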

Highlights

Kim et al. [15] proposed a fast coding unit depth decision algorithm based on a Convolutional Neural Network (CNN), which uses the CNN to predict the CTU depth instead of performing an exhaustive search that calculates the RD cost.
The proposed Size-adaptive Convolutional Neural Network (SAE-CNN) consists of an input layer, four convolutional layers, three pooling layers, a fully connected layer and an output layer.
This article proposes a Versatile Video Coding (VVC) intra prediction complexity reduction algorithm based on statistical theory and SAE-CNN.
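
To illustrate why a pooling layer with an adjustable size helps with CUs of different sizes, the sketch below implements generic adaptive max pooling in plain Python: the pooling window is derived from the input size so that every block maps to the same output grid. This is an assumption about how size adaptivity can be achieved in general; the exact pooling rule used by SAE-CNN is not stated here, and the function name is hypothetical.

```python
from typing import List

Block = List[List[float]]


def adaptive_max_pool(block: Block, out_h: int, out_w: int) -> Block:
    """Max-pool a block of any size down to a fixed out_h x out_w grid."""
    h, w = len(block), len(block[0])
    pooled: Block = []
    for i in range(out_h):
        # Window rows: floor(i*h/out_h) .. ceil((i+1)*h/out_h), end exclusive.
        r0 = (i * h) // out_h
        r1 = ((i + 1) * h + out_h - 1) // out_h
        row = []
        for j in range(out_w):
            c0 = (j * w) // out_w
            c1 = ((j + 1) * w + out_w - 1) // out_w
            row.append(max(block[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


def ramp(h: int, w: int) -> Block:
    """Test block whose sample at (r, c) equals r * w + c."""
    return [[float(r * w + c) for c in range(w)] for r in range(h)]


# Square and rectangular blocks all end up with the same output shape.
for h, w in [(8, 8), (16, 4), (32, 32)]:
    out = adaptive_max_pool(ramp(h, w), 4, 4)
    print((h, w), "->", (len(out), len(out[0])))
```

Because the output grid is fixed, the layers after the pooling stage can keep a single input dimension regardless of the CU size.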

Summary

Introduction

With the continuous improvement of video resolution and frame rate, the amount of data in a video keeps increasing. Take a 40-min high-definition color video as an example: with the mainstream 8-bit pixel depth and a playback speed of 30 frames per second, its raw data volume is 1920 × 1080 × 3 × 8 × 30 × 60 × 40 = 3,583,180,800,000 bits (about 3.58 × 10^12 bits, or roughly 448 GB), which is clearly impractical for video storage and real-time transmission. During video transmission, the reconstructed image is also distorted by channel noise [1,2]. Video coding technology came into being to compress video and thereby facilitate its transmission and application.
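
The raw data volume quoted above can be checked with a few lines of Python. The function and parameter names are only illustrative; the input values are the ones given in the text.

```python
def raw_video_bits(width: int, height: int, channels: int,
                   bit_depth: int, fps: int, seconds: int) -> int:
    """Size in bits of uncompressed video with the given parameters."""
    return width * height * channels * bit_depth * fps * seconds


# 40 minutes of 1920x1080 color video, 8 bits per sample, 30 frames/s.
bits = raw_video_bits(1920, 1080, 3, 8, 30, 40 * 60)
print(f"{bits:,} bits")               # 3,583,180,800,000 bits
print(f"{bits / 8 / 1e9:.1f} GB")     # 447.9 GB (decimal gigabytes)
```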

Objectives

Methods

Results

Conclusion