Abstract

Compared with High Efficiency Video Coding (HEVC), the latest video coding standard, Versatile Video Coding (VVC), greatly improves coding quality by introducing many novel technologies, including the Quad-tree with nested Multi-type Tree (QTMT) block partitioning scheme. With QTMT, however, the encoder must perform Rate–distortion Optimization (RDO) for every partition mode during Coding Unit (CU) partitioning in order to select the best one, which increases coding time and complexity. We therefore propose a VVC intra prediction complexity reduction algorithm based on statistical theory and the Size-adaptive Convolutional Neural Network (SAE-CNN). The algorithm combines a pre-decision dictionary built from statistical theory with a Convolutional Neural Network (CNN) model whose pooling layer size is adjusted adaptively, forming an adaptive CU partition decision process. It decides whether CUs of different sizes should be split, thereby avoiding unnecessary RDO and reducing coding time. Experimental results show that, compared with the original algorithm, the proposed algorithm saves 35.60% of the coding time while increasing the Bjøntegaard Delta Bit Rate (BD-BR) by only 0.91%.

Highlights

  • Kim et al. [15] proposed a fast coding unit depth decision algorithm based on a Convolutional Neural Network (CNN), which predicts the CTU depth with a CNN instead of computing the RD cost through an exhaustive search

  • The proposed Size-adaptive Convolutional Neural Network (SAE-CNN) comprises an input layer, four convolutional layers, three pooling layers, a fully connected layer, and an output layer

  • This article proposes a Versatile Video Coding (VVC) intra prediction complexity reduction algorithm based on statistical theory and SAE-CNN


Summary

Introduction

As video resolution and frame rate continue to rise, the amount of data in a video keeps growing. Take a 40-min high-definition color video as an example: with the mainstream 8-bit pixel depth and a playback speed of 30 frames per second, its raw data volume is 1920 × 1080 × 3 × 8 × 30 × 60 × 40 = 3,583,180,800,000 bits, which is clearly impractical for storage and real-time transmission. During transmission, the reconstructed image is also distorted by channel noise [1,2]. Video coding technology emerged to compress video and thereby facilitate its transmission and application.
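The raw data volume quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `raw_video_bits` and its parameter names are hypothetical and do not come from the article, and the conversion to gigabytes assumes 1 GB = 10^9 bytes.

```python
def raw_video_bits(width, height, channels, bit_depth, fps, seconds):
    """Size in bits of an uncompressed video (hypothetical helper)."""
    return width * height * channels * bit_depth * fps * seconds


# 40-minute 1920x1080 color video, 3 channels, 8 bits per sample, 30 frames/s
bits = raw_video_bits(1920, 1080, 3, 8, 30, 40 * 60)
print(f"{bits:,} bits")  # 3,583,180,800,000 bits

# Same volume in gigabytes, assuming 1 GB = 1e9 bytes
gigabytes = bits / 8 / 1e9
print(f"{gigabytes:.1f} GB")  # 447.9 GB
```

At roughly 448 GB for 40 minutes, the uncompressed stream is far beyond what ordinary storage and networks handle comfortably, which is the motivation the paragraph gives for video coding.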

