Abstract

The Versatile Video Coding standard introduced a series of novel tools to improve coding efficiency. However, these tools caused a massive increase in encoder computational complexity, and affine prediction accounts for a significant share of this complexity. In this context, this work proposes a GPU-oriented modeling of affine prediction that accelerates it by extracting as much parallelism as possible from such platforms. This modeling exploits parallelism at two levels: conducting the prediction of multiple coding tree units simultaneously and breaking the prediction down into multiple highly parallel stages. Since classical parallelization approaches for translational motion estimation are not very efficient for affine prediction, the proposed work explores the novel properties and parallelization opportunities introduced by affine prediction. Experimental results show that, when applied to 128×128 blocks, the proposed work can speed up affine prediction by a factor of 57.21 compared to a fully sequential encoder, with a small coding efficiency penalty of 0.16% BD-BR.
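The per-sub-block independence that makes affine prediction amenable to GPU parallelization comes from the affine motion model itself: each sub-block's motion vector is derived independently from the block's control-point motion vectors. The sketch below illustrates this with the standard 4-parameter affine model used in VVC; the function and variable names are illustrative, not taken from the paper.

```python
def subblock_mv(cp0, cp1, width, x, y):
    """Derive the motion vector at position (x, y) inside an affine block
    from two control-point motion vectors (4-parameter VVC affine model).

    cp0: MV at the top-left corner, cp1: MV at the top-right corner,
    width: block width in samples. Each MV is an (mvx, mvy) pair.
    Every (x, y) is independent, so on a GPU one thread (or warp) can
    derive one sub-block MV with no inter-thread communication.
    """
    a = (cp1[0] - cp0[0]) / width  # horizontal gradient of the MV field
    b = (cp1[1] - cp0[1]) / width  # vertical gradient of the MV field
    mvx = a * x - b * y + cp0[0]
    mvy = b * x + a * y + cp0[1]
    return (mvx, mvy)


def subblock_mv_field(cp0, cp1, width, height, sub=4):
    """MV for the center of every sub x sub sub-block (a fully data-parallel
    map over the block, matching the highly parallel per-stage formulation)."""
    return [
        [subblock_mv(cp0, cp1, width, x + sub / 2, y + sub / 2)
         for x in range(0, width, sub)]
        for y in range(0, height, sub)
    ]
```

Because `subblock_mv` has no data dependencies between positions, the whole MV field of a coding tree unit maps naturally onto a GPU kernel launch, and multiple coding tree units can be processed by independent kernel blocks, which is the two-level parallelism the abstract describes.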
