Abstract

This paper describes the algorithm and processing flow of the in-loop deblocking filter in the VP9 coding standard, one of the most computationally intensive tools in the VP9 codec. Because of the filter's inherent data dependencies, implementing it efficiently on massively parallel computing architectures, such as a general-purpose graphics processing unit (GPGPU), is a great challenge. We describe the challenges involved in a GPGPU implementation of the VP9 deblocking filter and introduce an innovative thread dispatching approach that addresses the parallelization challenges. This approach has been successfully implemented and productized in the VP9 decoder and encoder solutions on Intel GPUs. To further improve the parallelism of the deblocking algorithm itself, Intel and Microsoft jointly propose an improved in-loop deblocking algorithm and process flow for the upcoming AV1 codec standard, developed by the Alliance for Open Media (AOM). We describe the proposed algorithm and evaluate its quality impact relative to the current state-of-the-art AV1 reference codec.
