Abstract

To satisfy future computing demands of the Worldwide LHC Computing Grid (WLCG), opportunistic usage of third-party resources is a promising approach. While virtual machine and container technologies largely solve the problem of making such resources compatible with WLCG requirements, strategies to acquire and release many resources from many providers are still a focus of current research. Existing meta-schedulers that manage resources in the WLCG are hitting the limits of their design when tasked with managing heterogeneous resources from many diverse resource providers. To provide opportunistic resources to the WLCG as part of a regular WLCG site, we propose a new meta-scheduling approach suitable for opportunistic, heterogeneous resource provisioning. Instead of anticipating future resource requirements, our approach observes resource usage and promotes well-used resources. Following this approach, we have developed an inherently robust meta-scheduler, COBalD, for managing diverse, heterogeneous resources given unpredictable resource requirements. This paper explains the key concepts of our approach and discusses its benefits and limitations for dynamic resource provisioning compared to previous approaches.

Highlights

  • Dynamic resource provisioning in the Worldwide LHC Computing Grid (WLCG) [1] is commonly based on meta-scheduling and the pilot model [2]: A meta-scheduler pre-computes the ideal set of resources for a given set of workflows; so-called pilot jobs acquire and integrate these resources into an overlay batch system, which processes the initial workflows

  • The GridKa Tier 1 centre has developed a new approach for dynamic provisioning that is suitable for the WLCG and beyond

  • Even though job to resource to job meta-scheduling performs well for homogeneous resources and jobs, we have not been able to apply it to more complex, dynamic cases

Summary

Introduction

Dynamic resource provisioning in the WLCG [1] is commonly based on meta-scheduling and the pilot model [2]: A meta-scheduler pre-computes the ideal set of resources for a given set of workflows; so-called pilot jobs acquire and integrate these resources into an overlay batch system, which processes the initial workflows. While this approach offers a high level of control and precision, we have found that the strong coupling between components inherently limits scalability, flexibility and robustness. We have successfully used our approach to provision HPC and cloud resources to the WLCG, as well as to manage abstract resources in the form of multi-core and single-core allocations.
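The idea of observing resource usage and promoting well-used resources, rather than pre-computing an ideal resource set, can be illustrated with a minimal Python sketch. It is not the COBalD implementation or its API: the `Pool` class, the `adjust` function, the thresholds `low` and `high`, and the `step` size are all hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """Hypothetical pool of one resource type (not the COBalD API)."""
    name: str
    demand: float       # amount of resources requested from the provider
    utilisation: float  # observed fraction of acquired resources in use (0..1)


def adjust(pool: Pool, low: float = 0.5, high: float = 0.9, step: float = 1.0) -> float:
    """One feedback step: grow well-used pools, shrink poorly used ones.

    The thresholds and step size are illustrative assumptions.
    """
    if pool.utilisation >= high:
        pool.demand += step
    elif pool.utilisation < low:
        pool.demand = max(0.0, pool.demand - step)
    # Between the thresholds, the demand is left unchanged.
    return pool.demand


# Illustrative pools with made-up utilisation values.
pools = [
    Pool("hpc", demand=10.0, utilisation=0.95),
    Pool("cloud", demand=10.0, utilisation=0.2),
    Pool("local", demand=10.0, utilisation=0.7),
]
for pool in pools:
    adjust(pool)
```

In this sketch the controller never inspects the job queue. It reacts only to how well each pool is used, which is what keeps the meta-scheduler decoupled from the jobs it ultimately serves.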

Job to Resource to Job Meta-Scheduling
Feedback Control Loop Meta-Scheduling
The COBalD Pool Model
Orthogonality of Job and Meta-Scheduler
Towards Implicit Network Scheduling
Findings
Conclusions