Abstract

This chapter describes a distributed job scheduler for grid computing. The scheduler is a software tool that supports many users running parallel jobs concurrently on clusters of heterogeneous computers while maintaining load balance from both the user's and the system administrator's points of view. It has a self-healing feature that can recover a parallel computational fluid dynamics (CFD) computation from system- and hardware-related errors, and it can work with Globus. Its unique feature is that it allocates a set of computers to a parallel job and lets the application perform dynamic load balancing on those computers. Schedulers on different clusters can request resources from one another to fulfill the requests of parallel jobs. The scheduler also provides a user-friendly environment that requires the user to supply only the information necessary for job execution. An experiment is described to demonstrate the applicability of the scheduler.
