Abstract

Clusters are now considered an alternative to parallel machines for executing workloads made up of sequential and/or parallel applications. For efficient application execution on clusters, dynamic global process scheduling is of prime importance. Different dynamic scheduling policies that have been studied for distributed systems or parallel machines may be used in clusters. The choice of a particular policy depends on the kind of workload to be executed. In a cluster, it is thus highly desirable to implement a configurable global scheduler that can adapt the dynamic scheduling policy to the workload characteristics, take advantage of all cluster resources, and cope with node shutdown and reboot. In this paper, we present the architecture of the global scheduler and the process management mechanisms of Kerrighed, a single system image operating system designed for high performance computing on clusters. Kerrighed provides a development framework that makes it easy to implement dynamic scheduling policies without kernel modification. In Kerrighed, the global scheduling policy can be changed dynamically while applications execute on the cluster. Kerrighed's process management mechanisms make it easy to deploy parallel applications in the cluster and to efficiently migrate or checkpoint processes, including processes sharing memory. Kerrighed has been implemented as a set of modules extending the Linux kernel. Preliminary performance results are presented.
