OpenCL Kernel Vectorization on the CPU, GPU, and FPGA: A Case Study with Frequent Pattern Compression

Zheming Jin, Hal Finkel

doi:10.1109/fccm.2019.00071

Abstract

OpenCL promotes code portability and natively supports vectorized data types, which potentially lets developers take advantage of the single-instruction-multiple-data (SIMD) instructions on CPUs, GPUs, and FPGAs. FPGAs are becoming a promising heterogeneous computing component. In this study, we use a kernel from frequent pattern compression as a case study of OpenCL kernel vectorization on the three computing platforms. We describe different pattern matching approaches for the kernel and manually vectorize the OpenCL kernel by factors ranging from 2 to 16. We evaluate the kernel on an Intel Xeon 16-core CPU, an NVIDIA P100 GPU, and a Nallatech 385A FPGA card featuring an Intel Arria 10 GX1150 FPGA. Compared to the optimized, non-vectorized kernel, vectorization improves kernel performance by a factor of 16 on the FPGA. The improvement ranges from a factor of 1 to 11.4 on the CPU and from a factor of 1.02 to 9.3 on the GPU. The effectiveness of kernel vectorization depends on the work-group size.
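To make the idea of manual vectorization concrete, here is a minimal sketch in plain C rather than OpenCL C. It is not the authors' kernel. The pattern set (zero word, sign-extended byte, sign-extended halfword, uncompressed) is a simplified assumption, and the names `fpc_bits`, `u32x4`, `fpc_total_scalar`, and `fpc_total_vec4` are hypothetical. The struct `u32x4` stands in for OpenCL's built-in `uint4` type, and the four-element load mimics `vload4`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Data bits needed for one 32-bit word under a simplified, assumed
   pattern set. Unsigned wrap-around range checks keep this branch-light:
   (w + 0x80) < 0x100 holds exactly when w, read as a signed value,
   lies in [-128, 127]. */
static uint32_t fpc_bits(uint32_t w)
{
    uint32_t bits = 32;                              /* no pattern matched */
    if ((uint32_t)(w + 0x8000u) < 0x10000u) bits = 16; /* sign-extended halfword */
    if ((uint32_t)(w + 0x80u) < 0x100u)     bits = 8;  /* sign-extended byte */
    if (w == 0)                             bits = 0;  /* zero word */
    return bits;
}

/* Scalar baseline: one word per loop iteration. */
uint64_t fpc_total_scalar(const uint32_t *in, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += fpc_bits(in[i]);
    return total;
}

/* Stand-in for OpenCL's uint4 vector type (an assumption for this sketch). */
typedef struct { uint32_t s[4]; } u32x4;

/* Apply the pattern match to each of the four lanes. */
static u32x4 fpc_bits4(u32x4 v)
{
    u32x4 r;
    for (int k = 0; k < 4; k++)
        r.s[k] = fpc_bits(v.s[k]);
    return r;
}

/* Vectorized by a factor of 4: four words per loop iteration. */
uint64_t fpc_total_vec4(const uint32_t *in, size_t n)
{
    assert(n % 4 == 0);  /* the sketch handles no remainder elements */
    uint64_t total = 0;
    for (size_t i = 0; i < n; i += 4) {
        u32x4 v = {{ in[i], in[i + 1], in[i + 2], in[i + 3] }}; /* like vload4 */
        u32x4 b = fpc_bits4(v);
        total += (uint64_t)b.s[0] + b.s[1] + b.s[2] + b.s[3];
    }
    return total;
}
```

Both functions return the same total for any input whose length is a multiple of 4. In an actual OpenCL kernel each work-item would process one vector, so a wider vector means fewer work-items for the same data. This may be one reason the observed benefit depends on the work-group size.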

Similar Papers


Optimizing Parallel Reduction on OpenCL FPGA Platform – A Case Study of Frequent Pattern Compression
Zheming Jin ... Hal Finkel
01 May 2018

Evaluating Radial Basis Function Kernel on OpenCL FPGA Platform
Zheming Jin ... Hal Finkel
01 Oct 2018

Evaluating Floating-point Intensive Applications on OpenCL FPGA Platforms: A Case Study on the SimpleMOC Kernel
Zheming Jin ... Hal Finkel
01 Dec 2018

Automatic OpenCL work-group size selection for multicore CPUs
07 Oct 2013
