Abstract

Scheduling and resource allocation to optimize performance criteria in multi-cluster heterogeneous environments is known to be an NP-hard problem, not only because of resource heterogeneity but also because co-allocation can be applied to take advantage of idle resources across clusters. A common practice is to use basic heuristics that attempt to optimize some performance criteria by treating the jobs in the waiting queue individually. More recent works have proposed optimization strategies based on Linear Programming techniques that schedule multiple jobs simultaneously. However, the time cost of these techniques makes them impractical for large-scale environments. Population-based meta-heuristics have proved effective at finding optimal schedules in large-scale distributed environments with high resource diversification and large numbers of jobs in the batches. The algorithm proposed in the present work packages the jobs in the batch to obtain better optimization opportunities. It includes a multi-objective function to optimize not only the Makespan of the batches but also the Flowtime, thus ensuring a certain level of QoS from the users' point of view. The algorithm is also heterogeneity- and bandwidth-aware, and is useful for scheduling jobs in large-scale heterogeneous environments. The proposed meta-heuristic was evaluated with a real workload trace. The results show the effectiveness of the proposed method, which provides solutions that improve performance with respect to other well-known techniques in the literature.

Highlights

  • Multi-cluster environments are usually presented as an alternative to high-performance computing for solving large-scale optimization problems by leveraging the computational resources distributed throughout an organization

  • These environments are distinguished from Grid environments in that a multi-cluster uses a dedicated interconnection network between cluster resources, with a known topology and predictable performance characteristics, while in a Grid the computing resources are distributed over multiple organizations interconnected through the Internet

  • The authors present a novel scheduling meta-heuristic based on Genetic Algorithms (GA) for large-scale multi-cluster environments, applying co-allocation when necessary

Summary

Introduction

Multi-cluster environments are usually presented as an alternative to high-performance computing for solving large-scale optimization problems by leveraging the computational resources distributed throughout an organization. These environments are made up of several clusters of computers joined by dedicated interconnection networks (Javadi et al., 2006). The former can find good solutions among all the possible ones but do not guarantee that the best or a near-optimal solution will be found. These methodologies are faster than traditional exhaustive algorithms but inappropriate for large-scale scheduling problems. We propose a novel approach based on genetic algorithms for solving the parallel job-scheduling problem with co-allocation in heterogeneous multi-cluster environments.
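
To make the two criteria named in the abstract concrete, the following is a minimal sketch of how the Makespan and Flowtime of a batch could be computed and combined. It uses the common textbook definitions (Makespan as the finish time of the last job, Flowtime as the sum of the jobs' finish times). The `ScheduledJob` record, the `weight` parameter and the weighted-sum form of `fitness` are assumptions made for illustration only; they are not taken from the paper, whose actual fitness function is defined in the full text.

```python
from dataclasses import dataclass


@dataclass
class ScheduledJob:
    """Hypothetical record: one job with its planned start and run time."""
    name: str
    start: float    # seconds after the batch begins
    runtime: float  # estimated execution time in seconds

    @property
    def finish(self) -> float:
        return self.start + self.runtime


def makespan(jobs):
    """Time at which the last job of the batch finishes."""
    return max(job.finish for job in jobs)


def flowtime(jobs):
    """Sum of the finish times of all jobs in the batch."""
    return sum(job.finish for job in jobs)


def fitness(jobs, weight=0.5):
    """Assumed weighted sum of makespan and mean flowtime (lower is better)."""
    mean_flow = flowtime(jobs) / len(jobs)
    return weight * makespan(jobs) + (1.0 - weight) * mean_flow


# Illustrative batch of three jobs (made-up values, not from the workload trace).
batch = [
    ScheduledJob("j1", start=0.0, runtime=40.0),
    ScheduledJob("j2", start=0.0, runtime=10.0),
    ScheduledJob("j3", start=10.0, runtime=20.0),
]
print(makespan(batch), flowtime(batch), fitness(batch))
```

In this sketch a lower `weight` favours the users' view (jobs finishing early on average), while a higher `weight` favours the system's view (the whole batch finishing early). A genetic algorithm would evaluate a function of this kind for each candidate schedule in the population.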

Related work
Parallel job execution model
Problem description
E Gabaldon et al—Multi-criteria genetic algorithm applied to scheduling 289
Genetic algorithm
Chromosome encoding
Fitness function definitions
Genetic operators
Computation node allocation
Node allocation mutation
Genetic algorithm profiling
Experimental evaluation
Conclusions