Versatile Video Coding (VVC) significantly improves Rate-Distortion (R-D) performance over its predecessor, High Efficiency Video Coding (HEVC). However, it still faces several challenges. One is the efficient allocation of bits among Coding Tree Units (CTUs). Another is the lack of prior information for intra-frame coding, particularly for the first frame. In addition, after CTU-level bit allocation, only fixed parameters are available to determine the Lagrange multiplier λ for each CTU, which does not yield optimal R-D performance. To tackle these challenges, we propose a rate control solution based on a Convolutional Neural Network (CNN). The CNN predicts the key parameters α and β of the R-D model, which addresses the lack of prior information in intra-frame coding. The predicted α and β values are then used to adaptively allocate bits to each CTU. The proposed algorithm is implemented in VTM-16.0 and evaluated under the Common Test Conditions (CTC). Experimental results show that, compared to the default rate control algorithm in VTM-16.0, the proposed algorithm improves R-D performance by 0.96% while maintaining rate control accuracy.
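To make the roles of α and β concrete, the following is a minimal sketch, not the authors' implementation. It assumes the hyperbolic R-λ model λ = α · bpp^β used by the reference-software rate control, and it assumes that CTU bits are allocated by searching for a single λ shared by all CTUs so that the allocated bits sum to the frame budget. The abstract does not state the exact allocation rule, so the equal-λ criterion, the function names, the 128×128 CTU size, and the parameter values are all illustrative assumptions. In the proposed method, the per-CTU (α, β) pairs would come from the CNN.

```python
import math

def lambda_from_bpp(alpha, beta, bpp):
    """R-lambda model (assumed form): lambda = alpha * bpp ** beta."""
    return alpha * bpp ** beta

def bpp_from_lambda(alpha, beta, lam):
    """Inverse of the model: bits per pixel that corresponds to a given lambda."""
    return (lam / alpha) ** (1.0 / beta)

def allocate_ctu_bits(params, pixels_per_ctu, target_bits, iters=100):
    """Hypothetical equal-lambda allocation.

    params: list of (alpha, beta) per CTU, with beta < 0.
    Searches for one lambda shared by all CTUs such that the per-CTU
    bits implied by the model sum to target_bits.
    """
    def total_bits(lam):
        return sum(bpp_from_lambda(a, b, lam) * pixels_per_ctu for a, b in params)

    lo, hi = 1e-6, 1e6  # assumed search range for lambda
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # bisection in the log domain
        if total_bits(mid) > target_bits:
            lo = mid  # too many bits: a larger lambda is needed
        else:
            hi = mid
    lam = math.sqrt(lo * hi)
    bits = [bpp_from_lambda(a, b, lam) * pixels_per_ctu for a, b in params]
    return lam, bits

# Illustrative (alpha, beta) pairs for three CTUs; these are made-up values.
example_params = [(3.2, -1.37), (5.0, -1.2), (2.0, -1.5)]
example_lambda, example_bits = allocate_ctu_bits(example_params, 128 * 128, 30000)
```

Because β is negative, the modelled bit count falls as λ rises, so the bisection converges to a unique λ. CTUs whose (α, β) indicate more complex content receive a larger share of the budget at that common λ.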