Versatile video coding (VVC) is the next-generation video coding standard, released in July 2020. VVC introduces new coding tools that improve coding efficiency over its predecessor, high efficiency video coding (HEVC). These tools have a significant impact on the VVC software decoder, whose complexity is estimated at twice that of an HEVC decoder. In particular, the adaptive loop filter (ALF), introduced in VVC as an in-loop filter, increases both decoding complexity and memory usage. These concerns must be carefully addressed when designing an efficient hardware implementation of a VVC decoder. In this paper, we present an efficient hardware implementation of the ALF tool for a VVC decoder. The proposed solution establishes a novel scanning order between luma and chroma components that significantly reduces the ALF memory. The design takes advantage of all ALF features and provides a unified hardware module for all ALF filters. It uses 26 regular multipliers in a pipelined architecture with a fixed throughput of 2 pixels/cycle and a fixed system latency regardless of the selected filter. The design operates at 600 MHz, which enables decoding of 4K video at 30 frames per second in the 4:2:2 chroma sub-sampling format on an ASIC platform.
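
As a rough sanity check on the figures quoted above, the short sketch below compares the raw sample rate offered by a 600 MHz clock at 2 pixels/cycle with the sample rate of a 4K, 30 frames-per-second, 4:2:2 stream. It is a back-of-envelope estimate, not the paper's own analysis: it assumes that "4K" means 3840x2160, that "pixels/cycle" counts luma and chroma samples alike, and that pipeline stalls, memory access, and other decoder stages are ignored. All function and constant names are hypothetical.

```python
# Back-of-envelope throughput check. Values marked "from the abstract" are
# quoted figures; everything else is an assumption made for illustration.

CLOCK_HZ = 600_000_000      # from the abstract: 600 MHz
SAMPLES_PER_CYCLE = 2       # from the abstract: 2 pixels/cycle (read here as samples)
FPS = 30                    # from the abstract: 30 frames per second
WIDTH, HEIGHT = 3840, 2160  # assumption: "4K" means 3840x2160 (UHD)


def samples_per_frame_422(width: int, height: int) -> int:
    """Samples in one 4:2:2 frame: full-resolution luma plus two chroma
    planes at half horizontal and full vertical resolution."""
    luma = width * height
    chroma = 2 * (width // 2) * height
    return luma + chroma


def required_rate(width: int, height: int, fps: int) -> int:
    """Samples per second the filter must process to keep up with the stream."""
    return samples_per_frame_422(width, height) * fps


def available_rate(clock_hz: int, samples_per_cycle: int) -> int:
    """Raw upper bound on samples per second, ignoring any stalls or overhead."""
    return clock_hz * samples_per_cycle


required = required_rate(WIDTH, HEIGHT, FPS)
available = available_rate(CLOCK_HZ, SAMPLES_PER_CYCLE)
print(f"required : {required:,} samples/s")
print(f"available: {available:,} samples/s")
print(f"headroom : {available / required:.2f}x")
```

Under these assumptions the stream needs 497,664,000 samples per second against a raw budget of 1,200,000,000, so the quoted operating point is plausible with margin left for overheads that this estimate does not model.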