Abstract

The latest Versatile Video Coding (VVC) standard achieves significant coding efficiency improvement over its predecessor, the High Efficiency Video Coding (HEVC) standard, but at the expense of considerably higher complexity. As measured on the VVC test model (VTM), intra-mode comparison and selection in the rate-distortion optimization (RDO) search consume most of the encoding time. In this paper, we propose a deep multi-task learning based fast intra-mode decision approach that adaptively prunes redundant modes. First, we create a large-scale intra-mode database for VVC, covering both conventional angular modes and the newly introduced tools, i.e., intra sub-partition (ISP) and matrix-based intra prediction (MIP). Next, we propose a multi-task intra-mode decision network (MID-Net) to predict the most probable angular modes and whether the ISP and MIP modes can be skipped. Then, a fast intra-coding workflow is designed accordingly, involving rough mode decision (RMD) acceleration and candidate mode list (CML) pruning. In this workflow, the learning-oriented probability and the statistics-oriented probability are synthesized to further improve prediction accuracy, ensuring that only unnecessary intra-modes are skipped. Finally, experimental results show that our approach reduces the encoding time of VVC intra-coding by 40.48% with negligible rate-distortion degradation, outperforming other state-of-the-art approaches.
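
The abstract describes synthesizing a learning-oriented probability with a statistics-oriented probability and then pruning the candidate mode list. The following is a minimal sketch of what such a decision step could look like; it is not the paper's actual method. The function names, the mixing weight alpha, the keep_ratio cutoff, and the skip_threshold for ISP/MIP are all hypothetical illustrations, since the abstract does not specify the synthesis rule or thresholds.

```python
import numpy as np

NUM_INTRA_MODES = 67  # VVC intra modes: Planar, DC, and 65 angular modes

def synthesize_probabilities(p_learn, p_stat, alpha=0.5):
    """Blend the learning-oriented probability (network output) with the
    statistics-oriented probability (e.g., mode frequencies of neighboring
    CUs). alpha is a hypothetical mixing weight, not from the paper."""
    return alpha * p_learn + (1.0 - alpha) * p_stat

def prune_candidate_mode_list(p_modes, keep_ratio=0.9):
    """Keep the fewest modes whose synthesized probability mass reaches
    keep_ratio; all remaining modes are skipped in the RDO search."""
    order = np.argsort(p_modes)[::-1]            # most probable first
    cum = np.cumsum(p_modes[order])
    n_keep = int(np.searchsorted(cum, keep_ratio)) + 1
    return order[:n_keep].tolist()

def fast_intra_mode_decision(p_learn, p_stat, p_skip_isp, p_skip_mip,
                             skip_threshold=0.8):
    """Hypothetical decision step: prune the angular CML, then decide
    whether ISP and MIP can be skipped for this coding unit."""
    p_modes = synthesize_probabilities(p_learn, p_stat)
    cml = prune_candidate_mode_list(p_modes)
    test_isp = p_skip_isp < skip_threshold       # skip ISP if confident
    test_mip = p_skip_mip < skip_threshold       # skip MIP if confident
    return cml, test_isp, test_mip

# Toy usage with random stand-ins for MID-Net outputs and CU statistics.
rng = np.random.default_rng(0)
p_learn = rng.dirichlet(np.ones(NUM_INTRA_MODES))
p_stat = rng.dirichlet(np.ones(NUM_INTRA_MODES))
cml, test_isp, test_mip = fast_intra_mode_decision(
    p_learn, p_stat, p_skip_isp=0.9, p_skip_mip=0.3)
print(f"CML size: {len(cml)}, test ISP: {test_isp}, test MIP: {test_mip}")
```

Blending the two probability sources, as the abstract notes, hedges against network misprediction: a mode the network under-rates but that is statistically common in the local context still survives pruning, so only genuinely unnecessary modes are skipped.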
