Multi-GPU-based Swendsen–Wang multi-cluster algorithm with reduced data traffic

Yukihiro Komura

doi:10.1016/j.cpc.2015.04.025

Abstract

The computational performance of multi-GPU applications can be degraded by data communication between GPUs. To realize high-speed computation with multiple GPUs, the cost of this communication should be minimized. In this paper, I propose a multi-GPU computing method for the Swendsen–Wang (SW) multi-cluster algorithm that reduces the data traffic between GPUs. I achieve this reduction by adjusting the connection information between GPUs in advance. The code is implemented on the large-scale open science TSUBAME 2.5 supercomputer, and its performance is evaluated using a simulation of the three-dimensional Ising model at the critical temperature. The results show that the data traffic between GPUs is reduced by 90%, and the number of communications between GPUs decreases by about half. Using 512 GPUs, the computation time is 0.005 ns per spin update at the critical temperature for a total system size of N = 4096^3.

Full Text