Abstract

Versatile Video Coding (VVC) is the latest-generation video coding standard. In VVC, the quadtree with nested multi-type tree (QTMT) partition structure provides more flexible coding unit (CU) partition sizes than the quadtree (QT) structure used in the previous High Efficiency Video Coding (HEVC) standard. This flexibility considerably improves coding performance, but it also increases computational complexity, mainly because of the rate-distortion optimization (RDO) process. To reduce this complexity, a fast deep intra QTMT decision approach based on convolutional neural networks (CNNs) is adopted to determine the QTMT depth of each 128 × 128 Coding Tree Unit (CTU). The proposed algorithm predicts the QT depths at the 64 × 64 CU level and the binary tree (BT) depths at the 32 × 32 CU level using a trained CNN designed for each structure, instead of evaluating the rate-distortion (RD) cost. Experimental results show that the proposed deep QTMT approach reduces encoding complexity by up to 55.51% compared with the original reference software VTM3.0, with an average encoding time reduction of about 35% and an insignificant loss in encoding performance.
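As a rough illustration of the block layout described above, the following minimal sketch splits one 128 × 128 CTU into four 64 × 64 blocks and sixteen 32 × 32 blocks and passes each block to a depth predictor. The names `split_blocks`, `predict_depths`, `qt_model`, and `bt_model` are hypothetical, and the two models are plain callables standing in for the trained CNNs; the actual network architectures and the way the predictions are fed back into the encoder are not specified in this abstract.

```python
from typing import Callable, List, Sequence, Tuple

Block = List[List[int]]  # luma samples of one square block, row-major


def split_blocks(ctu: Sequence[Sequence[int]], size: int) -> List[Block]:
    """Split a square CTU into non-overlapping size x size blocks (raster order)."""
    n = len(ctu)
    if n % size != 0 or any(len(row) != n for row in ctu):
        raise ValueError("CTU must be square with a side divisible by the block size")
    return [
        [list(row[x:x + size]) for row in ctu[y:y + size]]
        for y in range(0, n, size)
        for x in range(0, n, size)
    ]


def predict_depths(
    ctu: Sequence[Sequence[int]],
    qt_model: Callable[[Block], int],  # hypothetical stand-in for the QT CNN
    bt_model: Callable[[Block], int],  # hypothetical stand-in for the BT CNN
) -> Tuple[List[int], List[int]]:
    """Return one QT depth per 64 x 64 block and one BT depth per 32 x 32 block."""
    qt_depths = [qt_model(block) for block in split_blocks(ctu, 64)]
    bt_depths = [bt_model(block) for block in split_blocks(ctu, 32)]
    return qt_depths, bt_depths


if __name__ == "__main__":
    # Dummy flat CTU and constant predictors, used only to show the call shape.
    ctu = [[0] * 128 for _ in range(128)]
    qt, bt = predict_depths(ctu, lambda block: 1, lambda block: 0)
    print(len(qt), len(bt))
```

In such a setup, the predicted depths would replace the exhaustive RD cost search over candidate partitions, which is where the reported encoding time savings would come from.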
