Abstract

Many grand challenge applications can benefit from metacomputing, i.e., the coordinated use of geographically distributed heterogeneous supercomputers. A salient feature of such systems is that network performance differs from one processor pair to another. This paper considers the problem of efficient application-level communication in heterogeneous network-based systems. We present a uniform communication scheduling framework for developing adaptive communication schedules for various collective communication patterns. The framework enables schedules to be developed at runtime, based on network performance information obtained from a directory service. Based on this framework, we have developed communication schedules for the total exchange communication pattern. Our first algorithm develops a schedule by computing a series of matchings in a bipartite graph. We also present a heuristic algorithm based on the open shop scheduling problem. The completion time of the schedule produced by this heuristic is guaranteed to be within a factor of two of the optimal. Simulation results show performance improvements by a factor of 5 over well-known homogeneous scheduling techniques. This paper is an early effort in formalizing and solving communication problems for metacomputing systems. We discuss several research issues that must be addressed to allow efficient collective communication in such environments.
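To make the open shop view of total exchange concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a simple model in which each node can send one message and receive one message at a time, and in which `t[i][j]` is the time for node `i` to deliver its message to node `j`. The function names `greedy_total_exchange` and `lower_bound` and the sample matrix are hypothetical. The sketch uses plain greedy list scheduling: whenever a sender and its receiver are both idle, the transfer starts. Under the stated model, any schedule with this property finishes within twice the largest per-node send or receive load, which is itself a lower bound on the optimal completion time.

```python
def lower_bound(t):
    """Largest total send time or receive time of any single node."""
    n = len(t)
    send_load = [sum(t[i][j] for j in range(n) if j != i) for i in range(n)]
    recv_load = [sum(t[i][j] for i in range(n) if i != j) for j in range(n)]
    return max(send_load + recv_load)


def greedy_total_exchange(t):
    """Greedy (dense) schedule for total exchange.

    t[i][j] is the assumed transfer time from node i to node j.
    Returns a list of (start, sender, receiver, end) tuples.
    """
    n = len(t)
    pending = {(i, j) for i in range(n) for j in range(n) if i != j}
    send_free = [0] * n   # time at which each node's send port becomes idle
    recv_free = [0] * n   # time at which each node's receive port becomes idle
    schedule = []
    now = 0
    while pending:
        # Try longer transfers first; ties are broken by node indices.
        order = sorted(pending, key=lambda p: (-t[p[0]][p[1]], p))
        for i, j in order:
            if send_free[i] <= now and recv_free[j] <= now:
                end = now + t[i][j]
                send_free[i] = recv_free[j] = end
                schedule.append((now, i, j, end))
                pending.remove((i, j))
        if pending:
            # Jump to the next moment at which some port becomes idle.
            now = min(x for x in send_free + recv_free if x > now)
    return schedule


if __name__ == "__main__":
    # Hypothetical pairwise transfer times for three nodes.
    t = [[0, 3, 1],
         [2, 0, 4],
         [5, 1, 0]]
    sched = greedy_total_exchange(t)
    for start, i, j, end in sorted(sched):
        print(f"node {i} -> node {j}: [{start}, {end})")
    print("completion time:", max(e for _, _, _, e in sched))
    print("lower bound:", lower_bound(t))
```

In a runtime setting such as the one the abstract describes, the matrix `t` would be filled from measured network performance, for example values returned by a directory service, rather than hard-coded.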

