Programmable Graphics Processing Units (GPUs) have recently become a promising means of performing scientific computations. Modern GPUs have been shown to outperform traditional Central Processing Units (CPUs) in floating-point throughput, owing to their inherently data-parallel architecture and higher bandwidth. They allow scientific computations to be performed without noticeable loss of accuracy, in a fraction of the time required by traditional CPUs and at substantially lower cost, making them viable alternatives to expensive computer clusters or workstations. GPU programmability, however, has fostered the development of a variety of programming languages, making it challenging to select a language and use it consistently without the risk of it becoming obsolete. Some GPU languages are hardware specific and are designed to deliver performance gains on their vendor's GPUs (e.g., NVIDIA CUDA). Others are operating system specific (e.g., Microsoft HLSL). A few are platform agnostic and can be used on a workstation with any CPU and GPU (e.g., GLSL, OpenCL). Of the companies and organizations that incorporate formal optimization into their processes, only a few utilize GPUs, either because the others are heavily invested in CPU-based computing or because they are not fully aware of the benefits of implementing population-based optimization routines on GPUs. The literature contains a large number of research publications on GPU-based optimization. However, most are limited to specific GPU hardware or address specific problems. The diversity of current GPU hardware and software APIs presents an overwhelming number of choices, making it challenging to decide where and how to begin transitioning to GPU-based computing and impeding the adoption of a promising and relatively cost-effective computing avenue.
In this paper, the authors address some of these issues by broadly classifying GPU APIs into three categories: 1) hardware vendor dependent GPU APIs, 2) graphical in context APIs, and 3) platform agnostic APIs. Prior work by the authors demonstrated the capability of digital pheromones within Particle Swarm Optimization (PSO) for searching n-dimensional design spaces with improved accuracy, efficiency, and reliability in serial and parallel CPU computing environments. To study the impact of GPUs, the authors implemented this digital pheromone variant of PSO on three GPU APIs, each representing one of the categories listed above, in a deliberately simple form: unconstrained, explicit objective function evaluations are delegated to the GPU. While this approach itself cannot be considered novel, the lessons learned from implementing it on different GPU APIs provide a wealth of information that the authors believe can help companies and organizations involved in optimization make informed decisions about adopting GPUs in their processes.
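The delegation pattern described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a basic PSO without digital pheromones, the names `sphere_batch` and `pso` and all parameter values are hypothetical, and the batch evaluator runs on the CPU as a stand-in for a GPU kernel. The point is only the structure: all particle positions are gathered and evaluated in a single batch call per iteration, which is the step a GPU API would take over.

```python
import random


def sphere_batch(positions):
    """Evaluate f(x) = sum(x_i^2) for every particle in one call.

    Assumption: this batch call marks the step that would be handed to a
    GPU; here it is plain Python running on the CPU.
    """
    return [sum(x * x for x in p) for p in positions]


def pso(evaluate_batch, dim=2, n_particles=20, iters=100,
        lo=-5.0, hi=5.0, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic PSO (no digital pheromones) with batched objective evaluation."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # One batch evaluation for the initial swarm.
    fit = evaluate_batch(pos)
    pbest = [p[:] for p in pos]
    pbest_fit = fit[:]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(iters):
        # Velocity and position updates stay on the host.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))

        # The whole swarm is evaluated in a single delegated call.
        fit = evaluate_batch(pos)

        # Update personal and global bests from the returned values.
        for i in range(n_particles):
            if fit[i] < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit[i]
                if fit[i] < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit[i]

    return gbest, gbest_fit


if __name__ == "__main__":
    best, best_fit = pso(sphere_batch)
    print(best, best_fit)
```

Because the optimizer only sees `evaluate_batch` as a function from a list of positions to a list of objective values, the same loop could in principle be paired with an evaluator written against any of the three API categories without changing the swarm logic.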