Abstract

This paper compares OpenMP and OpenCL using parallel implementations of algorithms drawn from several fields of computer applications. The study focuses on benchmark performance under the two programming models. We observed that the OpenCL programming model is a good option for mapping threads onto different processing cores. Balancing the load across all available cores and allocating a sufficient amount of work to every compute unit can lead to improved performance. Our test system ran the Fedora operating system on an Intel Xeon dual-core processor with a thread count of 24, coupled with an NVIDIA Quadro FX 3800 graphics processing unit.

Highlights

  • Quad-core and multi-core processors and GPUs [1] have already become the standard for both workstations and high-performance computers

  • We compare the performance of these test cases with the OpenCL code on the GPU and on a multi-core CPU with OpenMP support

  • Support for recursion was introduced in the OpenMP 3.0 specification through the “task” construct. We find no significant improvement in performance, since most of the code to be parallelized is kept inside a critical section, as in the function int put(int Queens[], int row, int column)
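
The task-based approach mentioned in the last highlight can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not the paper's code: the names is_safe, solve_serial, count_solutions, and MAX_N are hypothetical, and the paper's own put function is not reproduced here. The sketch assumes one OpenMP task per first-row column and keeps only the final accumulation inside a critical section, which is one possible way to limit the serialization described above.

```c
#include <assert.h>

#define MAX_N 16  /* hypothetical upper bound on board size */

/* Returns 1 if a queen at (row, col) is not attacked by the queens
   already placed in rows 0..row-1, and 0 otherwise. */
static int is_safe(const int queens[], int row, int col)
{
    for (int r = 0; r < row; r++) {
        int c = queens[r];
        if (c == col || c - col == row - r || c - col == r - row)
            return 0;
    }
    return 1;
}

/* Plain recursive backtracking; counts completions of a partial board. */
static long solve_serial(int queens[], int row, int n)
{
    if (row == n)
        return 1;
    long count = 0;
    for (int col = 0; col < n; col++) {
        if (is_safe(queens, row, col)) {
            queens[row] = col;
            count += solve_serial(queens, row + 1, n);
        }
    }
    return count;
}

/* Counts N-Queens solutions, creating one task per first-row column. */
long count_solutions(int n)
{
    assert(n >= 1 && n <= MAX_N);
    long total = 0;

    #pragma omp parallel
    #pragma omp single
    {
        for (int col = 0; col < n; col++) {
            #pragma omp task firstprivate(col) shared(total)
            {
                int queens[MAX_N];   /* private board for this task */
                queens[0] = col;
                long local = solve_serial(queens, 1, n);

                /* Only the shared update is serialized. */
                #pragma omp critical
                total += local;
            }
        }
        #pragma omp taskwait         /* wait for all tasks created above */
    }
    return total;
}
```

Because each task works on its own board and touches shared state only once, the critical region covers a single addition rather than the search itself. If the file is compiled without OpenMP support, the pragmas are ignored and the same code runs serially.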

Summary

INTRODUCTION

Quad-core and multi-core processors and GPUs [1] have already become the standard for both workstations and high-performance computers. These systems use aggressive multithreading so that whenever a thread stalls waiting for data, the hardware can efficiently switch to executing another thread. Given the diversity of high-performance architectures, the question is which one best fits a given workload; the extent to which an application benefits from these systems depends on the availability of cores and on other workload parameters. This paper addresses these issues by implementing parallel algorithms for the four test cases and comparing their performance in terms of execution time and the percentage speed-up achieved.
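
The two metrics named above can be made concrete with a small sketch. The text here does not define how the speed-up percentage is computed, so the definitions below are assumptions: speedup is taken as serial time divided by parallel time, and percent_improvement as the relative reduction in execution time. Both function names are hypothetical.

```c
/* Assumed definition: speed-up factor = serial time / parallel time. */
double speedup(double t_serial, double t_parallel)
{
    return t_serial / t_parallel;
}

/* Assumed definition: percentage reduction in execution time
   relative to the serial run. */
double percent_improvement(double t_serial, double t_parallel)
{
    return 100.0 * (t_serial - t_parallel) / t_serial;
}
```

Under these assumed definitions, a run that takes 8 seconds serially and 2 seconds in parallel has a speed-up factor of 4 and a 75 percent reduction in execution time.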

PARALLEL COMPUTING PARADIGM
Shared Memory System
Distributed Memory System
EXPERIMENTAL RESULTS
Matrix Multiplication
Image Convolution
String Reversal
RELATED WORK
CONCLUSION & FUTURE SCOPE
