Abstract

We evaluate a novel implementation of a Self-Organizing Map (SOM) on a Graphics Processing Unit (GPU) cluster. Using various combinations of OpenCL, CUDA, and two different graphics cards, we demonstrate the scalability of the SOM implementation on one to eight GPUs. Results indicate that while the algorithm scales well with the number of training samples and the map size, the benefits of the data-parallel approach offered by the GPU are severely limited when combined with the Message Passing Interface (MPI) in this setting, and are comparable to the speedups of single-GPU implementations over optimized sequential code. Achieved speedups range from 3 to 32 across the various map and training-data sizes. We also observed a performance penalty for the OpenCL implementation compared to CUDA.
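The abstract refers to the data-parallel core of SOM training that a GPU accelerates: for every sample, find the best-matching unit (BMU) across the whole map, then apply a neighborhood-weighted batch update. The paper's actual OpenCL/CUDA kernels are not shown here; the following is only a minimal NumPy sketch of one batch-SOM step, with hypothetical names (`som_batch_step`, `sigma` for the Gaussian neighborhood radius), to illustrate the computation being parallelized.

```python
import numpy as np

def som_batch_step(weights, data, sigma):
    """One batch update of a SOM (illustrative sketch, not the paper's code).

    weights: (rows, cols, dim) codebook vectors on a 2-D grid.
    data:    (n, dim) training samples.
    sigma:   Gaussian neighborhood radius in grid units.
    """
    rows, cols, dim = weights.shape
    flat = weights.reshape(-1, dim)                    # (rows*cols, dim)
    # BMU search: squared distance from every sample to every map unit.
    # This all-pairs step is what the GPU computes in parallel.
    d2 = ((data[:, None, :] - flat[None, :, :]) ** 2).sum(axis=2)
    bmu = d2.argmin(axis=1)                            # BMU index per sample
    # Grid coordinates of each unit, and of each sample's BMU.
    grid = np.array([(r, c) for r in range(rows)
                            for c in range(cols)], dtype=float)
    gd2 = ((grid[bmu][:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    h = np.exp(-gd2 / (2.0 * sigma ** 2))              # (n, rows*cols) neighborhood
    # Batch rule: each unit becomes the h-weighted mean of the data.
    num = h.T @ data                                   # (rows*cols, dim)
    den = h.sum(axis=0)[:, None] + 1e-12
    return (num / den).reshape(rows, cols, dim)
```

In a multi-GPU/MPI setting such as the one evaluated, each rank would typically compute the partial sums `num` and `den` over its local shard of `data` and combine them with an all-reduce before the division, which is where the communication overhead discussed in the results arises.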

