Abstract
Conjugate gradient (CG) is one of the most popular iterative approaches to solving large sparse linear systems of equations. This work reports a parallel implementation of CG on clusters with EARTH multithreaded runtime support. Interphase and intraphase communication costs are balanced using a two-dimensional blocking method, minimizing overall communication costs. EARTH's adaptive, event-driven multithreaded execution model gives additional opportunities to overlap communication and computation to achieve even better scalability. Experiments on a large Beowulf cluster with gigabit Ethernet show notable improvements over other parallel CG implementations. For example, with the NAS CG benchmark problem size Class C, our implementation achieved a speedup of 41 on a 64-node cluster, compared to 13 for the MPI-based NAS version. The results demonstrate that the combination of the two-dimensional blocking method and the EARTH architectural runtime support helps to compensate for the low communications bandwidth common to most clusters.
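As a rough illustration of the two ideas named above, the following is a minimal serial sketch in Python of textbook CG in which the matrix-vector product is computed block by block over a two-dimensional grid. It is not the EARTH implementation: the function names, the `pr`-by-`pc` grid parameters, and the dense list-of-lists matrix are assumptions made for illustration, and the accumulation into `y` merely stands in for the reduction that a cluster would perform across the nodes holding one block row.

```python
def blocked_matvec(A, x, pr, pc):
    """Compute y = A x with A split into a pr-by-pc grid of blocks.

    Each (bi, bj) block represents the work one node would own in a
    two-dimensional decomposition (hypothetical layout, serial here).
    """
    n = len(A)
    row_edges = [n * i // pr for i in range(pr + 1)]
    col_edges = [n * j // pc for j in range(pc + 1)]
    y = [0.0] * n
    for bi in range(pr):
        for bj in range(pc):
            for r in range(row_edges[bi], row_edges[bi + 1]):
                partial = 0.0
                for c in range(col_edges[bj], col_edges[bj + 1]):
                    partial += A[r][c] * x[c]
                # Stand-in for the reduction across one block row.
                y[r] += partial
    return y


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, pr=2, pc=2, tol=1e-10, max_iter=1000):
    """Textbook CG for a symmetric positive definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0
    p = list(r)          # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = blocked_matvec(A, p, pr, pc)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

In this sketch each block needs only the slice of `x` matching its columns and contributes only to the slice of `y` matching its rows, which is the property a two-dimensional decomposition relies on to limit how much vector data each node exchanges.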