Abstract

This article presents the hardware design of the 16x16 2-D DCT used in the High Efficiency Video Coding (HEVC) standard. The transform stage is one of the innovations introduced by HEVC: it supports variable transform sizes, from 4x4 to 32x32, allowing larger transforms than those used in previous standards. The presented design exploits the separability property of the 2-D DCT by using two instances of the one-dimensional (1-D) DCT. The architecture targets low hardware cost and high throughput, so the HEVC 16-point DCT algorithm was simplified for a more efficient hardware implementation. Several operation and hardware minimization strategies were used to achieve these simplifications: operation reordering, factoring, conversion of multiplications into shift-adds, and sharing of common sub-expressions. The 1-D DCT architectures were designed as fully combinational circuits to reduce control overhead, and a transposition buffer connects the two 1-D DCT architectures. The design was synthesized for a Stratix III FPGA and for TSMC 65 nm standard-cell technology. The complete 2-D DCT architecture achieves real-time processing for high and ultra-high definition videos, such as Full HD, QFHD and UHD 8K. Compared with related works, the architectures designed in this work reached the highest throughput and the lowest hardware resource consumption.
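The two ideas the abstract relies on most, separability and multiplier-free constant multiplication, can be illustrated with a short software sketch. It is not the authors' architecture. It assumes a floating-point orthonormal DCT-II basis rather than the HEVC integer coefficient matrix, and the helper names (`dct_matrix`, `dct_2d_separable`, `times_90_shift_add`) and the constant 90 are hypothetical choices made only for illustration.

```python
import math

N = 16  # transform size discussed in the abstract


def dct_matrix(n):
    """Orthonormal DCT-II basis (assumption: floating point, not the HEVC integer matrix)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def dct_1d(vec, basis):
    """One 1-D DCT: a matrix-vector product with the basis."""
    return [sum(c * x for c, x in zip(row, vec)) for row in basis]


def transpose(block):
    """Software stand-in for the transposition buffer between the two 1-D stages."""
    return [list(col) for col in zip(*block)]


def dct_2d_separable(block, basis):
    """2-D DCT computed as two 1-D passes connected by a transposition."""
    stage1 = [dct_1d(row, basis) for row in block]      # first 1-D DCT, row by row
    buffered = transpose(stage1)                         # transposition buffer
    stage2 = [dct_1d(row, basis) for row in buffered]    # second 1-D DCT
    return transpose(stage2)                             # restore (u, v) orientation


def dct_2d_direct(block, basis):
    """Reference 2-D DCT from the double-sum definition, used only for comparison."""
    n = len(block)
    return [[sum(basis[u][i] * basis[v][j] * block[i][j]
                 for i in range(n) for j in range(n))
             for v in range(n)]
            for u in range(n)]


def times_90_shift_add(x):
    """Multiply an integer by the constant 90 using only shifts and adds.

    90 = 64 + 16 + 8 + 2; the constant is an illustrative assumption.
    """
    return (x << 6) + (x << 4) + (x << 3) + (x << 1)


if __name__ == "__main__":
    # Deterministic sample block of small signed integers.
    block = [[(3 * i + 5 * j) % 17 - 8 for j in range(N)] for i in range(N)]
    basis = dct_matrix(N)
    separable = dct_2d_separable(block, basis)
    direct = dct_2d_direct(block, basis)
    max_err = max(abs(separable[u][v] - direct[u][v])
                  for u in range(N) for v in range(N))
    print("largest separable-vs-direct difference:", max_err)
    print("shift-add check:", times_90_shift_add(7) == 90 * 7)
```

In this sketch the separable form needs two passes of N 1-D transforms instead of a full double sum per output coefficient, which is the property that lets a hardware design reuse a 1-D DCT datapath. The shift-add helper shows how one constant multiplication can be replaced by four shifts and three additions; sharing such partial terms across several constants is what the abstract calls sharing of common sub-expressions.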
