Abstract
There is a continuous push for improved and innovative video solutions to support video conferencing, video surveillance, transcoding, streaming video, and many other customer-centric applications. Increasing frame rates and frame sizes demand high-performance hardware accelerators (HWAs) that enable efficient pipelining at the level of 16×16-pixel macroblocks (MBs) inside the video processing engine (IVAHD). The in-loop de-blocking filter of the H.264 codec reduces blocking artifacts in each MB, and it is very demanding in terms of cycles and resources (memory accesses and memory storage). Removing the blocking artifacts introduced by block-based video codecs accounts for around 20-25% of overall decoder complexity in the current generation of standards (H.264), and this trend is expected to continue in H.265. The high adaptability of the filter process, the small block size (4×4), the motion vector (MV) dependent boundary strength (BS) computation for each edge of a 4×4 block, the predefined filtering order (vertical edges followed by horizontal edges), and the loading of pixel data for the current and neighboring MBs together require a large number of accesses to the shared memory of the IVAHD (SL2), more processing cycles, and a larger internal pixel buffer (IPB). This paper discusses a novel approach to the loop filter (LPF) operation that overcomes these barriers and allows the IVAHD to reach a frame rate of up to 240 fps in full HD H.264 processing with industry-leading area and power. The final design in a 28 nm CMOS process is expected to occupy around 0.10 mm² after actual place and route (220 Kgates with 5 KB of internal memory). The proposed design is capable of handling 4K at 60 fps and is scalable to support the H.265 in-loop de-blocking filter.
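The two throughput figures quoted above (full HD at 240 fps and 4K at 60 fps) can be compared by counting macroblocks per second. The following is a minimal sketch, not taken from the paper: it assumes that "full HD" means 1920×1080 and "4K" means 3840×2160 (the abstract does not give exact resolutions), and that frame dimensions are rounded up to a whole number of 16×16 MBs. The function names are hypothetical.

```python
MB_SIZE = 16  # an H.264 macroblock covers 16x16 luma pixels


def macroblocks_per_frame(width, height):
    """Number of MBs per frame, rounding each dimension up to whole MBs."""
    mbs_wide = (width + MB_SIZE - 1) // MB_SIZE
    mbs_high = (height + MB_SIZE - 1) // MB_SIZE
    return mbs_wide * mbs_high


def macroblocks_per_second(width, height, fps):
    """MB throughput the de-blocking filter must sustain at a given frame rate."""
    return macroblocks_per_frame(width, height) * fps


# Assumed resolutions (not stated in the abstract).
full_hd_240 = macroblocks_per_second(1920, 1080, 240)
uhd_60 = macroblocks_per_second(3840, 2160, 60)

print(full_hd_240)  # 1958400
print(uhd_60)       # 1944000
```

Under these assumptions the two operating points require nearly the same MB rate (about 1.96 million versus 1.94 million MBs per second), which is consistent with the claim that a design sized for full HD at 240 fps can also handle 4K at 60 fps.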