Abstract

Video encoding based on the novel HEVC standard is an extremely computationally expensive process, and achieving efficient encoding requires intelligent utilization of all available resources, from both the software and the hardware perspective. Profiling and analysis of the encoding process identified the discrete cosine transform (DCT) as one of the key kernels that consume most of the application's runtime. Therefore, a high-throughput, fully pipelined hardware accelerator was designed for FPGA and integrated into the MANGO platform. The MANGO platform is a heterogeneous HPC system that consists of different types of nodes, from general-purpose nodes (GNs) to heterogeneous nodes (HNs). While executing specific kernels on GNs is a straightforward process, executing kernels on accelerator-based HNs is a more complex procedure and requires specific integration to successfully exploit the heterogeneous architecture. This paper presents a performance-efficient integration of the DCT hardware accelerator into the MANGO platform, focusing on the performance of the encoder while maintaining the coding efficiency and video quality of the encoded bitstream. Several approaches were considered, tested, and compared, from a standalone integration in which a series of single tasks was offloaded to the DCT accelerator to more complex solutions based on smart buffer utilization.
