Abstract

Intel recently introduced the Heterogeneous Architecture Research Platform (HARP). In this platform, a Central Processing Unit (CPU) and a Field-Programmable Gate Array (FPGA) are connected through a high-bandwidth, low-latency interconnect and share the same DRAM memory. For this platform, the Open Computing Language (OpenCL), a High-Level Synthesis (HLS) language, is made available. By using HLS, a faster design cycle can be achieved than with a traditional hardware description language; this, however, comes at the cost of less control over the hardware implementation. We investigate how OpenCL can be applied to implement a real-time guided image filter on the HARP platform. In the first phase, the performance-critical parameters of the OpenCL programming model are characterized using several specialized benchmarks. In the second phase, the guided image filter algorithm is implemented using the insights gained in the first phase. Both a floating-point and a fixed-point variant were developed, based on a sliding-window implementation. This resulted in a maximum floating-point performance of 135 GFLOPS, a maximum fixed-point performance of 430 GOPS, and a throughput of HD color images at 74 frames per second.
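The guided image filter is built from local window statistics, and the sliding-window structure mentioned above is what lets those statistics be streamed through the FPGA pipeline. As a minimal sketch of that structure (our own hedged illustration, not the authors' implementation), the following single work-item OpenCL kernel computes a 3x3 box mean with a shift-register line buffer; the image width W is assumed to be a compile-time constant and border handling is omitted.

    // Hypothetical single work-item OpenCL kernel: 3x3 sliding-window mean
    // with a shift-register line buffer. Illustration only, not the paper's code;
    // the image width W is assumed fixed and border handling is omitted.
    #define W 1920

    __kernel void box_mean_3x3(__global const float *restrict in,
                               __global float *restrict out,
                               const int height)
    {
        float window[2 * W + 3];          // holds the last two rows plus three pixels

        for (int i = 0; i < 2 * W + 3; i++)
            window[i] = 0.0f;

        const int total = W * height;
        for (int idx = 0; idx < total; idx++) {
            // Shift by one pixel and insert the new sample at the front.
            #pragma unroll
            for (int i = 2 * W + 2; i > 0; i--)
                window[i] = window[i - 1];
            window[0] = in[idx];

            // Sum the 3x3 neighbourhood: three taps from each buffered row.
            float sum = 0.0f;
            #pragma unroll
            for (int r = 0; r < 3; r++)
                for (int c = 0; c < 3; c++)
                    sum += window[r * W + c];

            // The window centre lags the input stream by one row and one column.
            const int center = idx - W - 1;
            if (center >= 0)
                out[center] = sum * (1.0f / 9.0f);
        }
    }

The same pattern extends to the other window statistics the guided filter needs (means, variances, and covariances of the guide and input images), and to a fixed-point data path by replacing the float arithmetic with integer types of suitable width.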

Highlights

  • Ever since the introduction of the von Neumann architecture, there has been a need for ever-increasing computing speed

  • The goal of this research was to evaluate the Open Computing Language (OpenCL) as a High-Level Synthesis (HLS) language for the implementation of a guided image filter on the HARP platform. We performed this evaluation by using synthetic benchmarks to measure the maximum bandwidth and the influence of the OpenCL cache and shared virtual memory (SVM), and by using these results as guidelines to implement and optimize the guided image filter (a sketch of such a benchmark follows this list)

  • Based on our findings, it can be stated that the HARP platform offers several architectural features that benefit the implementation of applications such as the guided image filter
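As a hedged illustration of what such a synthetic bandwidth benchmark might look like (our own sketch, not the benchmark code used in this work), the kernel below streams a buffer from shared memory and reduces it to a single value; the host derives the read bandwidth from the buffer size and the measured kernel run time.

    // Hypothetical read-bandwidth micro-benchmark, not this work's benchmark suite.
    // The host times the kernel; read bandwidth = (n * sizeof(float)) / elapsed time.
    __kernel void stream_read(__global const float *restrict src,
                              __global float *restrict result,
                              const int n)
    {
        float acc = 0.0f;
        // A sequential streaming access pattern keeps the memory interface busy;
        // accumulating into a result prevents the reads from being optimized away.
        for (int i = 0; i < n; i++)
            acc += src[i];
        *result = acc;
    }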


Summary

Introduction

Ever since the introduction of the von Neumann architecture, there has been a need for ever-increasing computing speed. Until around 2015, improved architectural design, faster memories, domain-specific accelerators, and high-performance-oriented languages enabled computing power to keep up with the most demanding applications. This growth is now being slowed by the limits of Dennard scaling and Moore’s law: Dennard scaling [1] started to break down for designs below 65 nm due to increasing leakage power, and the slowdown of Moore’s law limits performance progress to 3% per year [2]. Heterogeneous platforms such as Intel’s Heterogeneous Architecture Research Platform (HARP) respond with architectural innovations that include a cache-coherent interface with transparent address translation, a high-speed, wide communication path, and support for the high-level synthesis language Open Computing Language (OpenCL).
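A short host-side sketch may clarify what the cache-coherent interface with transparent address translation enables in practice: the CPU and the FPGA kernel can work on the same allocation through OpenCL shared virtual memory (SVM). The fragment below is a minimal sketch assuming the standard OpenCL 2.0 SVM API (clSVMAlloc, clSetKernelArgSVMPointer) and a device that supports fine-grained buffer SVM; the exact flow on HARP's tool chain may differ.

    /* Minimal host-side sketch of OpenCL shared virtual memory use.
     * Assumes the standard OpenCL 2.0 SVM API and a device supporting
     * fine-grained buffer SVM; context, queue and kernel are created beforehand. */
    #include <CL/cl.h>
    #include <stdio.h>

    #define N (1 << 20)

    static void run_svm_example(cl_context ctx, cl_command_queue queue, cl_kernel kernel)
    {
        /* One allocation that both the CPU and the FPGA kernel can address. */
        float *data = (float *)clSVMAlloc(ctx,
                                          CL_MEM_READ_WRITE | CL_MEM_SVM_FINE_GRAIN_BUFFER,
                                          N * sizeof(float), 0);
        if (!data) { fprintf(stderr, "clSVMAlloc failed\n"); return; }

        /* The host initializes the buffer directly; no explicit copy is needed. */
        for (size_t i = 0; i < N; i++)
            data[i] = (float)i;

        /* Hand the SVM pointer to the kernel and launch it. */
        clSetKernelArgSVMPointer(kernel, 0, data);
        size_t gsize = N;
        clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &gsize, NULL, 0, NULL, NULL);
        clFinish(queue);

        /* With coherent, fine-grained SVM the results are visible in place. */
        printf("data[0] after kernel: %f\n", data[0]);
        clSVMFree(ctx, data);
    }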

