Abstract
This paper introduces a dual hybrid neural network model that combines convolutional neural networks (CNNs) and artificial neural networks (ANNs) to optimize the quantization parameter (QP) for both 64×64 and 32×32 blocks in the Versatile Video Coding (VVC) standard, enhancing video quality and compression efficiency. The model employs CNNs for spatial feature extraction and ANNs for structured data handling, addressing the limitations of current heuristic and just-noticeable-distortion (JND)-based methods. A dataset of luminance-channel image blocks, encoded with various QP values, is generated and preprocessed, and the dual hybrid network is built from convolutional and dense layers. QP optimization is applied at two levels: the 64×64 model provides a global QP offset, while the 32×32 model refines the QP for further-partitioned blocks. Performance evaluations using model error metrics such as mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE), together with perceptual metrics such as weighted PSNR (WPSNR), MS-SSIM, PSNR-HVS-M, and VMAF, demonstrate the model's effectiveness. While our approach performs competitively with state-of-the-art algorithms overall, it significantly outperforms them on VMAF, a widely adopted perceptual quality metric. Furthermore, the dual-model approach yields better results at lower resolutions, whereas the single-model approach is more effective at higher resolutions. These results highlight the adaptability of the proposed models, which improve both compression efficiency and perceptual quality and are therefore well suited to practical applications in modern video coding.
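For illustration only, the following is a minimal sketch of what such a dual hybrid CNN + dense-layer design could look like in Keras. It is not the authors' exact architecture: the layer counts, widths, and the build_qp_model helper are assumptions introduced here, reflecting only the abstract's description of a CNN front-end, dense regression layers, and separate 64×64 and 32×32 models.

```python
# Hypothetical sketch (not the paper's exact architecture): a pair of
# CNN + dense ("hybrid") regressors mapping a luminance block to a QP
# value/offset, one model for 64x64 blocks and one for 32x32 sub-blocks.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_qp_model(block_size: int) -> tf.keras.Model:
    """Convolutional layers for spatial features, dense layers for regression."""
    inputs = layers.Input(shape=(block_size, block_size, 1))  # luma channel only
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(inputs)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation="relu")(x)
    outputs = layers.Dense(1)(x)  # predicted QP offset (regression head)
    model = models.Model(inputs, outputs)
    # MSE/MAE match the model-error metrics reported in the evaluation.
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model

qp_model_64 = build_qp_model(64)  # global QP offset for 64x64 blocks
qp_model_32 = build_qp_model(32)  # refined QP for further-partitioned 32x32 blocks
```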