Abstract

Field-programmable gate arrays (FPGAs) are a promising choice as heterogeneous computing components for energy-aware, high-performance applications. Emerging high-level synthesis (HLS) tools such as the Intel FPGA Software Development Kit for Open Computing Language (OpenCL) offer a streamlined design flow that makes FPGAs more accessible to scientists and researchers. In this paper, we focus on optimizing the OpenCL design of the Bob Jenkins lookup3 hash function, which is used in the open-source software version of Memcached. We describe the kernel optimizations on the FPGA in detail and evaluate the resource utilization, performance, and performance per watt of the kernel implementations on an Arria 10-based FPGA platform. The experimental results show that the optimized design achieves a 3.46X speedup in kernel execution time over the baseline implementation on the Nallatech 385A FPGA card, which features an Arria 10 GX 1150 FPGA chip. In terms of performance per watt, we achieve 8 MHash/watt on the Arria 10 FPGA, a 14X and 1.2X improvement over an Intel Xeon E5 CPU and an Nvidia K80 GPU, respectively.
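
To make the performance-per-watt comparison concrete, the stated ratios can be inverted to estimate the efficiency each baseline device would have. The sketch below uses only the three figures quoted in the abstract; the function name `implied_baseline` is illustrative, and because the quoted ratios are rounded, the results are approximations rather than measured values.

```python
# Figures quoted in the abstract (rounded as published).
FPGA_MHASH_PER_WATT = 8.0   # Arria 10 FPGA
GAIN_OVER_CPU = 14.0        # improvement over the Intel Xeon E5 CPU
GAIN_OVER_GPU = 1.2         # improvement over the Nvidia K80 GPU


def implied_baseline(fpga_efficiency, gain):
    """Return the baseline efficiency implied by a stated improvement ratio."""
    return fpga_efficiency / gain


cpu = implied_baseline(FPGA_MHASH_PER_WATT, GAIN_OVER_CPU)
gpu = implied_baseline(FPGA_MHASH_PER_WATT, GAIN_OVER_GPU)

# Approximate values only: the inputs are rounded ratios, not measurements.
print(f"Implied CPU efficiency: ~{cpu:.2f} MHash/W")
print(f"Implied GPU efficiency: ~{gpu:.2f} MHash/W")
```

Under these assumptions the CPU works out to roughly 0.57 MHash/W and the GPU to roughly 6.67 MHash/W, which shows that the FPGA's advantage over the GPU is modest compared with its advantage over the CPU.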
