Abstract

With the more complex partitioning structures introduced in recent generations of video coding standards, the computational complexity of the partition block size search in video encoders has increased drastically. To speed up the overall encoding process, it is desirable to make faster partitioning decisions without much degradation in compression performance. In this paper, we propose a multi-scale, multi-stage machine learning (ML) based framework to accelerate the partition block size search. The framework comprises a collection of ML models, each dedicated to making a simple decision for a particular block size at a particular stage of the partitioning rate-distortion optimization (RDO) process. The ML models predict whether the RD evaluation of certain partition block sizes can be skipped, saving unnecessary computation in the encoder. The proposed approach is implemented and tested on VP9 using the open-source library libvpx. A significant improvement in encoding speed is observed, with negligible regression in compression performance. The framework and methodology can also be readily applied to other video codecs and implementations.
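
The idea of one small model per block size and per stage can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the paper's implementation and not libvpx code: the `SkipModel` class, the stage name `"before_split"`, the two-element feature vector, and all weights are hypothetical placeholders chosen for the example.

```python
import math
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass
class SkipModel:
    """Tiny logistic classifier (hypothetical; weights are placeholders, not trained values)."""
    weights: Sequence[float]
    bias: float
    threshold: float = 0.5

    def predict_skip(self, features: Sequence[float]) -> bool:
        # Linear score followed by a sigmoid; the score is clamped to avoid overflow in exp().
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        z = max(-50.0, min(50.0, z))
        prob = 1.0 / (1.0 + math.exp(-z))
        return prob >= self.threshold


# One model per (block size, RDO stage). Keys and values are assumptions for illustration.
MODELS: Dict[Tuple[int, str], SkipModel] = {
    (64, "before_split"): SkipModel(weights=[1.5, -2.0], bias=0.0),
    (32, "before_split"): SkipModel(weights=[1.0, -1.0], bias=-0.5),
}


def may_skip(block_size: int, stage: str, features: Sequence[float]) -> bool:
    """Return True if the RD evaluation for this block size and stage may be skipped."""
    model = MODELS.get((block_size, stage))
    if model is None:
        # No dedicated model: fall back to the full RD search.
        return False
    return model.predict_skip(features)
```

In this sketch an encoder would call `may_skip` before evaluating a partition candidate and run the full RD evaluation whenever it returns `False`, so a missing or unconfident model never removes a candidate from the search.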

