Abstract
Fair resource sharing is rapidly becoming an essential requirement in cluster computing systems. Although many fair scheduling algorithms have been proposed in recent decades, controlling resource sharing among jobs on servers remains a challenging problem; handled poorly, it can lead to chaotic resource contention and violations of jobs' service-level agreements. To address this problem, we propose a resource container-based job management approach for fair resource sharing. We first design and implement a general container-based job management module that provides lightweight, fine-grained resource allocation and isolation for job execution. Building on this module, we propose a resource-aware management scheme that enables fair resource sharing in job scheduling and dispatching. We evaluate the approach by implementing the module and applying the scheme on TCluster, a cluster computing system developed in-house at a leading global Internet corporation. Results show that our approach is effective at guaranteeing fair resource sharing with negligible overhead.