Abstract

Conjugate gradient (CG) is one of the most popular iterative approaches to solving large sparse linear systems of equations. This work reports a parallel implementation of CG on clusters with EARTH multithreaded runtime support. Interphase and intraphase communication costs are balanced using a two-dimensional blocking method, minimizing overall communication costs. EARTH'S adaptive, event-driven multithreaded execution model gives additional opportunities to overlap communication and computation to achieve even better scalability. Experiments on a large Beowulf cluster with gigabit Ethernet show notable improvements over other parallel CG implementations. For example, with the NAS CG benchmark problem size Class C, our implementation achieved a speedup of 41 on a 64-node cluster, compared to 13 for the MPl-based NAS version. The results demonstrate that the combination of the two-dimensional blocking method and the EARTH architectural runtime support helps to compensate for the low communications bandwidth common to most clusters.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call