Abstract

This paper presents a blocked algorithm for the all-pairs shortest paths (APSP) problem on a hybrid CPU-GPU system. The blocked APSP algorithm minimizes the amount of data communicated between CPU (host) memory and GPU memory. When the problem size (the number of vertices in the graph) is sufficiently large compared with the blocking factor, the algorithm requires the exchange of effectively only two block matrices between the CPU and the GPU (CPU$\rightleftharpoons$GPU) for each block computation on the GPU. We also estimate the memory and communication bandwidth required to utilize the GPU efficiently. On a system containing an Intel Westmere CPU (Core i7 970) and an AMD Cypress GPU (Radeon HD 5870), our implementation of the blocked APSP algorithm achieves up to 1 TFlop/s in single precision.
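To make the idea of a blocked APSP computation concrete, the following is a minimal CPU-only sketch in Python. It assumes the blocked algorithm is a variant of the well-known three-phase blocked Floyd-Warshall scheme, which the abstract does not state explicitly. The names `floyd_warshall`, `relax_block`, `blocked_apsp`, and the blocking factor `b` are hypothetical and chosen for illustration. The sketch involves no GPU and no host-device transfers, so it shows only the block structure, not the paper's implementation.

```python
INF = float("inf")


def floyd_warshall(dist):
    """Reference (unblocked) Floyd-Warshall on a dense distance matrix."""
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def relax_block(d, bi, bj, bk, b):
    """Relax block (bi, bj) in place through the vertices of block bk."""
    n = len(d)
    # k is the outermost loop so that updates within dependent blocks
    # are applied in the same order as in the unblocked algorithm.
    for k in range(bk * b, min((bk + 1) * b, n)):
        for i in range(bi * b, min((bi + 1) * b, n)):
            for j in range(bj * b, min((bj + 1) * b, n)):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]


def blocked_apsp(dist, b):
    """Three-phase blocked Floyd-Warshall with b x b blocks (assumed scheme)."""
    n = len(dist)
    d = [row[:] for row in dist]
    nb = (n + b - 1) // b  # number of blocks per dimension
    for bk in range(nb):
        # Phase 1: the diagonal block depends only on itself.
        relax_block(d, bk, bk, bk, b)
        # Phase 2: blocks in the same block row and block column.
        for x in range(nb):
            if x != bk:
                relax_block(d, bk, x, bk, b)
                relax_block(d, x, bk, bk, b)
        # Phase 3: all remaining blocks, each independent of the others.
        for bi in range(nb):
            for bj in range(nb):
                if bi != bk and bj != bk:
                    relax_block(d, bi, bj, bk, b)
    return d
```

In this scheme, the phase 3 updates dominate the work for large graphs and are mutually independent within one iteration of `bk`, which is the property that generally makes such block updates candidates for offloading to an accelerator. A call such as `blocked_apsp(graph, 2)` on a dense distance matrix `graph` (with `INF` for missing edges and 0 on the diagonal) should return the same matrix as `floyd_warshall(graph)`.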
